(57) A method and apparatus is described for segmenting an image, for adaptively scaling an image, and
for automatically scaling and cropping an image based on codestream header data. In one embodiment, a file
that can provide a header containing multi-scale entropy distribution information on blocks of an image is
received. Each block is assigned to the scale, from a set of scales, that maximizes a cost function.
The cost function is a product of a total likelihood and a prior. The total likelihood is a product of
likelihoods of the blocks. The image is segmented by grouping together blocks that have been assigned
equivalent scales. In one embodiment, the file represents an image in JPEG 2000 format.
Description

RELATED APPLICATIONS 

[0001] This application is related to the co-pending application entitled Content And Display Device Dependent
Creation Of Smaller Representations Of Images, concurrently filed on January 10, 2002, U.S. Patent Application Serial
No. , assigned to the corporate assignee of the present invention.

FIELD OF THE INVENTION 

[0002] The invention relates generally to the field of image processing. More specifically, the invention relates to
processing images using multi-scale transforms.

BACKGROUND OF THE INVENTION 

[0003] Digital images can be represented and stored in a variety of formats. A common feature in digital image
representation formats is that the bits constituting an image file are divided into image description bits and header bits.
Image description bits describe the actual underlying image. Often the image description bits are divided into smaller
units for convenience. Header bits provide organizational information about the image, such as image size in pixels,
file size, length in bits for the various smaller image description units, etc.

[0004] Compressed image files contain a wide variety of organizational information in the header, primarily to facilitate
convenient file management and interpretation. For example, in addition to conventional information such as width,
height, color component information and other details, JPEG 2000 (ITU-T Rec. T.800 | ISO/IEC 15444-1:2000) image
headers also provide information about the number of bits contained in smaller units, such as groups of wavelet
coefficients (termed codeblocks), that constitute the compressed data for the image, and about the wavelet-domain
locations of these small units of coefficients. Other image file formats can contain similar information.

[0005] In R. de Queiroz and R. Eschbach, "Fast segmentation of the JPEG compressed documents," Electronic
Imaging, vol. 7, pp. 367-377, April 1998, segmentation of conventional JPEG compressed documents using the entropy
of 8x8 blocks in the image is described. The technique described therein does not use header-based processing, as
the entropy values are not available in the conventional JPEG image header. Also, the technique employs a discrete
cosine transform ("DCT") used by conventional JPEG that operates only on local 8x8 blocks. Hence, the technique
does not use multi-scale transforms. Furthermore, the technique only uses the available entropy distributions on 8x8
blocks in the image domain and does not have access to any multi-scale bit distribution.

[0006] Image analysis involves describing, interpreting, and understanding an image. Image analysis extracts
measurements, data or information from an image. Image analysis techniques involve feature extraction, segmentation and
classification. Image analysis may be referred to as computer vision, image data extraction, scene analysis, image
description, automatic photointerpretation, region selection or image understanding. See W. Pratt, Digital Image
Processing, (2nd Edition), John Wiley & Sons, Inc., New York, NY, 1995, and A. Jain, Fundamentals of Digital Image
Processing, Prentice Hall, Englewood Cliffs, NJ, 1995.

[0007] Image processing produces a modified output image from an input image. Image processing techniques
include cropping, scaling, point operations, filtering, noise removal, restoration, and enhancement (Jain, chapters 7 and 8;
Pratt, Part 4).

[0008] In some applications, it is desirable to first perform image analysis on an image and then to use the analysis
to control image processing on the image. For example, the program "pnmcrop" (http://www.acme.com/software/pbmplus/)
first analyzes an image to find stripes of a background color (a single color value, for example white or black)
on all four sides. Then it performs an image processing operation, cropping, on the image to remove the stripes.

SUMMARY OF THE INVENTION

[0009] A method and apparatus is disclosed herein for performing operations such as image segmentation, adaptive
scale selection, and automatic region selection and scaling on the underlying image using only the image file header
information. The image files use a multi-scale image compression technique. A multi-scale bit allocation, which is used
for processing, is estimated from the file header. The processing algorithms use the number of bits allocated by the
image coder (or, in another embodiment, estimated to be allocated) as a quantitative measure for the visual importance
of the underlying features.
BRIEF DESCRIPTION OF THE DRAWINGS

[0010] The present invention will be understood more fully from the detailed description given below and from the
accompanying drawings of various embodiments of the invention, which, however, should not be taken to limit the
invention to the specific embodiments, but are for explanation and understanding only.

Figure 1 illustrates a multi-scale entropy distribution for an image;
Figure 2 is a flow diagram illustrating one embodiment of a process for segmenting an image;
Figure 3 illustrates a segmentation map superimposed on an exemplary image of a woman;
Figure 4 illustrates a segmentation map superimposed on an exemplary image of Japanese text;
Figure 5 is a flow diagram of one embodiment of a process for adaptively scaling an image;
Figure 6 illustrates adaptive scaling of an exemplary image of a woman;
Figure 7 illustrates adaptive scaling of an exemplary image of Japanese text;
Figure 8 is a flow diagram of one embodiment of a process for automatically scaling and cropping an image;
Figure 9 illustrates automatic scaling and cropping of an exemplary image of a woman;
Figure 10 illustrates automatic scaling and cropping of an exemplary image of Japanese text;
Figure 11A is a block diagram of one embodiment of an apparatus to perform the processing described herein;
Figure 11B is a block diagram of an alternative embodiment of an apparatus to perform the processing described
herein; and
Figure 12 is a block diagram of a computer system.

DETAILED DESCRIPTION 

[0011] A method and apparatus for using file header information to process an underlying digital image is described.
The file header information may be part of a bit stream that includes compressed data corresponding to the underlying
digital image. The processing described herein uses the information in the header and processes it in a specific way to
determine what portions of the compressed data to decode. In essence, the information in the header enables
identification of a region or regions upon which further processing is to occur.

[0012] In one embodiment, the compressed data comprises an image representation format resulting from
multi-scale transform-based compression. Compressed data consists of header and image description bits. That is,
multi-scale transform-based compression is applied to image data as part of the process of generating the image
description bits. From the header, the image coder's entropy distribution, or bit allocation, in the multi-scale domain may be
estimated and used as a quantitative measure for visual importance of the underlying image features. For example,
from the header of a JPEG 2000 file, information such as the length of codeblocks, the number of zero bit planes, and the
number of coding passes may be used to determine the entropy distribution. In this manner, the bit distribution in a
multi-scale transform-based representation is used to perform one or more operations, including, but not limited
to, image segmentation, adaptive scale/resolution selection for images, and automatic scaling, detection, selection,
and cropping of important image regions.

[0013] In one embodiment, information in the header is used to generate an entropy distribution map that indicates
which portions of the compressed image data contain desirable data for subsequent processing. An example of such
a map is given in Figure 1. Other maps are possible and may indicate the number of layers, which are described below
with the description of JPEG 2000, to obtain a desired bit rate (particularly for cases when layer assignment is related
to distortion) or the entropy distribution for each of a number of bit rates. In the latter case, each rectangular area on
the map has a vector associated with it. The vector might indicate values for multiple layers.

[0014] Image representation formats that utilize multi-scale transforms to compress the image description bits
typically incorporate many organizational details in the header, so that the pixel-wise description of the digital image can
be decoded correctly and conveniently. JPEG 2000 is an example of an image compression standard that provides
multi-scale bit distributions in the file header. Often the image description bits are divided among smaller units, and
the number of bits allocated by the encoder to these units is stored in the image header to facilitate features such as
partial image access, adaptation to networked environments, etc. Using information theoretic conventions, the allocated
number of bits is referred to as the entropy of each small unit. Entropy distributions used by image coders provide an
excellent quantitative measure for visual importance in the compressed images. For lossless compression, an image
coder uses more bits to describe the high activity (a lot of detail) regions, and fewer bits to convey the regions with little
detail information. For lossy compression, the image coder typically strives to convey the best possible description of
the image within the allocated bits. Hence, the coder is designed to judiciously spend the few available bits describing
visually important features in the image.

[0015] A multi-scale image coder does not code image pixels, but coefficients of the transformed image, where the
transform performs a separation of image information into various frequency bands. Multi-scale image coders (e.g., a
JPEG 2000 coder) provide the multi-scale distribution of entropy for the underlying image in the image header. Since
such transform basis functions exhibit simultaneous spatial and frequency localization, the transform coefficients
contain information about the frequency content at a specified location in the image.

[0016] The ability to process an image simply based on its header is desirable, because not only is the header
information easily accessed using a small number of computations, but also the condensed nature of the available
image information enables more efficient subsequent processing. Importantly, the header information, which is easy
to access, indicates information about the image without decoding coefficients. Therefore, processing decisions can
be made without having to expend a large amount of time decoding coefficients.

[0017] The techniques described herein have applications in areas such as, but not limited to, display-adaptive image
representations, digital video surveillance, image database management, image classification, image retrieval, and
preprocessing for pattern analysis, image filtering and sizing.

[0018] In the following description, numerous details are set forth. It will be apparent, however, to one skilled in the
art, that the present invention may be practiced without these specific details. In other instances, well-known structures
and devices are shown in block diagram form, rather than in detail, in order to avoid obscuring the present invention.

[0019] Some portions of the detailed descriptions which follow are presented in terms of algorithms and symbolic
representations of operations on data bits within a computer memory. These algorithmic descriptions and
representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their
work to others skilled in the art. An algorithm is here, and generally, conceived to be a self-consistent sequence of steps
leading to a desired result. The steps are those requiring physical manipulations of physical quantities. Usually, though
not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred,
combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common
usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.

[0020] It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate
physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise
as apparent from the following discussion, it is appreciated that throughout the description, discussions utilizing terms
such as "processing" or "computing" or "calculating" or "determining" or "displaying" or the like, refer to the action and
processes of a computer system, or similar electronic computing device, that manipulates and transforms data
represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly
represented as physical quantities within the computer system memories or registers or other such information storage,
transmission or display devices.

[0021] The present invention also relates to apparatus for performing the operations herein. This apparatus may be
specially constructed for the required purposes, or it may comprise a general purpose computer selectively activated
or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a computer
readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs,
and magnetic-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs,
magnetic or optical cards, or any type of media suitable for storing electronic instructions, and each coupled to a
computer system bus.

[0022] The algorithms and displays presented herein are not inherently related to any particular computer or other
apparatus. Various general purpose systems may be used with programs in accordance with the teachings herein, or
it may prove convenient to construct more specialized apparatus to perform the required method steps. The required
structure for a variety of these systems will appear from the description below. In addition, the present invention is not
described with reference to any particular programming language. It will be appreciated that a variety of programming
languages may be used to implement the teachings of the invention as described herein.

[0023] A machine-readable medium includes any mechanism for storing or transmitting information in a form readable
by a machine (e.g., a computer). For example, a machine-readable medium includes read only memory ("ROM");
random access memory ("RAM"); magnetic disk storage media; optical storage media; flash memory devices; electrical,
optical, acoustical or other form of propagated signals (e.g., carrier waves, infrared signals, digital signals, etc.); etc.

[0024] Figure 1 illustrates one multi-scale entropy distribution for an image. The image undergoes JPEG 2000
encoding initially. The underlying patterns are the wavelet coefficients of the image. The thin lines denote the JPEG 2000
division of the wavelet domain coefficients into code blocks, and the thick lines separate the different wavelet
sub-bands. In JPEG 2000, the coder performing the encoding process allocates and divides the wavelet domain coefficients
into small units called code blocks. The numbers shown in each square are the bits or entropies allocated to the
respective code blocks by the JPEG 2000 coder operating at 0.5 bits per pixel using three levels of decomposition.
These numbers represent the multi-scale entropy distribution.

[0025] The entropy allocations, which are accessed using only the JPEG 2000 file header, provide a good measure
for the visual importance of the different features at various scales and help distinguish between the different types of
important image features characterized by different multi-scale properties. For example, to describe the feather region
in the image, a multi-scale image coder spends many bits coding the fine scale coefficients corresponding to the feather
region and fewer bits on the coarse scale coefficients. On the other hand, to code the face
region, a multi-scale image coder spends more bits coding the intermediate scale coefficients corresponding to the
face region. The smooth background receives few bits. Thus, the multi-scale entropy distribution provides significant
information about the underlying image features. Assuming knowledge of the multi-scale entropy distribution is obtained
from headers, one or more operations may be performed. These operations may be, for example, image segmentation,
automatic active region identification and scaling, and/or adaptive image scaling.

[0026] JPEG 2000 is a standard to represent digital images in a coherent code-stream and file format (See, e.g.,
ITU-T Rec. T.800 | ISO/IEC 15444-1:2000, "JPEG 2000 image coding standard," in www.iso.ch). JPEG 2000 efficiently
represents digital images by efficiently coding the wavelet coefficients of the image using the following steps. A typical
image consists of one or more components (e.g., red, green, blue). Components are rectangular arrays of samples.
These arrays are optionally divided further into rectangular tiles. On a tile-by-tile basis, the components are optionally
decorrelated with a color space transformation. Each tile-component is compressed independently. Wavelet coefficients
of each color component in the tile are obtained. The wavelet coefficients are separated into local groups in the wavelet
domain. These are called code blocks. The code blocks are optionally ordered using precincts. Arithmetic coding is
used to code these different wavelet-coefficient groups independently. The coded coefficients are optionally organized
into layers to facilitate progression. Coded data from one layer of one resolution of one precinct of one component of
one tile is stored in a unit called a packet. In addition to coded data, each packet has a packet header. After coding, a
tile-component is optionally divided into tile-parts, otherwise the tile-component consists of a single tile-part. A tile-part
is the minimum unit in the code-stream that corresponds to the syntax. A JPEG 2000 codestream consists of syntax
(main and tile-part headers, plus EOC) and one or more bitstreams. A bitstream consists of packets (coded data for
codeblocks, plus any instream markers including instream packet headers). The organizational information to parse
the coded data, the packet headers, may be stored in the main header, tile headers, or in-stream.

[0027] JPEG 2000 has main headers and tile headers which contain marker segments. JPEG 2000 also has packet
headers which may be contained in marker segments or be in-stream in the bit stream. Headers are read and used
as inputs to processing which obtains a multi-scale entropy distribution. Table 1 summarizes the information contained
in various JPEG 2000 headers that is relevant to header-based processing.
Table 1: Uses of JPEG 2000 file header information

| Header entries | Type of information | Role in entropy estimation | Main | Tile | In-stream |
|---|---|---|---|---|---|
| Packet header (PPM, PPT, in-stream) | Length of coded data; number of zero bit planes and coding passes | Provides entropy of each code block of each sub-band of each component of tile. Facilitates estimation of entropy allocation at lower bit rates. Provides rough estimate of coefficient energies and magnitudes. | ✓ | ✓ | ✓ |
| Packet length (PLM, PLT) | Lengths of packets | Facilitates faster estimation of code block entropies for some JPEG 2000 files | ✓ | ✓ | |
| Tile-length part (TLM, SOT) | Lengths of tiles | Provides entropy of each tile. Facilitates local and global entropy comparison | ✓ | ✓ | |
| SIZ | Size of image | Helps determine location of code blocks | ✓ | | |
| COD, COC, QCC, QCD | Coding style | Number of transform levels, code block size, maximum size of coefficients, precinct information | ✓ | ✓ | |
| RGN | Region information | Estimate size and importance of region of interest. Alters meaning of most of the above information | ✓ | ✓ | |

In the case of the packet header (PPM, PPT, in-stream), it may be in either the main header, tile header or in-stream,
but not a combination of any two or more of these at the same time. On the other hand, the packet length and
tile-length part may be in the main header or the tile headers, or in both at the same time.
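
As a rough illustration of how the marker segments listed in Table 1 can be located without decoding any coefficients, the following Python sketch walks the main header of a raw JPEG 2000 codestream. It assumes the input is a bare codestream (not a JP2 file wrapper) and relies on the Part 1 convention that each marker segment carries a 2-byte big-endian length that counts the length field itself but not the marker. The function name `scan_main_header` is hypothetical, and the payloads are returned unparsed.

```python
import struct

# Marker codes from the JPEG 2000 Part 1 codestream syntax.
MARKER_NAMES = {
    0xFF51: "SIZ", 0xFF52: "COD", 0xFF53: "COC", 0xFF55: "TLM",
    0xFF57: "PLM", 0xFF58: "PLT", 0xFF5C: "QCD", 0xFF5D: "QCC",
    0xFF5E: "RGN", 0xFF5F: "POC", 0xFF60: "PPM", 0xFF61: "PPT",
    0xFF63: "CRG", 0xFF64: "COM",
}
SOC = b"\xff\x4f"   # start of codestream (no length field)
SOT = 0xFF90        # start of tile-part: the main header ends here


def scan_main_header(codestream: bytes):
    """Return [(marker_name, payload)] for every main-header marker segment."""
    if codestream[:2] != SOC:
        raise ValueError("codestream does not start with SOC")
    segments = []
    pos = 2
    while pos + 4 <= len(codestream):
        marker, length = struct.unpack_from(">HH", codestream, pos)
        if marker == SOT:
            break
        if length < 2:
            raise ValueError("invalid marker segment length")
        # Payload follows the 2-byte marker and the 2-byte length field.
        payload = codestream[pos + 4 : pos + 2 + length]
        segments.append((MARKER_NAMES.get(marker, hex(marker)), payload))
        pos += 2 + length
    return segments
```

A caller would typically look up the SIZ, COD and QCD payloads in the returned list and parse them further; that parsing is outside the scope of this sketch.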

Estimation of Low Bit Rate Image From High Bit Rate Image

[0028] The multi-scale entropy distribution at lower bit rates provides a robust measure for visual importance. At
higher bit rates the existence of image noise, which is present in digital images from any sensor or capture device,
corrupts the overall entropy distribution. Depending on the application, images are encoded losslessly or lossy. The
layering scheme in the JPEG 2000 standard could be used to order the codestream of a lossless or high bit rate
encoded image into layers of visual or Mean-Squared-Error (MSE)-based importance. In this case, a low bit rate version
of the image could be obtained by extraction of information from only the packets in some layers and ignoring the
packets in the other layers. If such layering is not employed by the encoder, the packet length information from the
header can yield the multi-scale entropy distribution only at the bit rate chosen by the encoder, e.g. lossless, high bit
rate or low bit rate.
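
The layer-based extraction described above amounts to summing the lengths of the packets that belong to the retained layers. The following sketch shows that arithmetic under stated assumptions: the packet lengths are taken to be available already (for instance from PLM/PLT marker segments), and the `(layer_index, length_in_bytes)` tuple layout and the function name `rate_after_layers` are hypothetical.

```python
def rate_after_layers(packet_lengths, num_layers, num_pixels):
    """Bit rate (bits per pixel) when only the first num_layers layers are kept.

    packet_lengths: iterable of (layer_index, length_in_bytes), layers counted from 0.
    """
    kept_bytes = sum(n for layer, n in packet_lengths if layer < num_layers)
    return 8.0 * kept_bytes / num_pixels


# Three packets, one per layer, for a hypothetical 16000-pixel image.
packets = [(0, 1000), (1, 2000), (2, 4000)]
low_rate = rate_after_layers(packets, 1, 16000)   # 8 * 1000 / 16000 = 0.5
```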

[0029] If the encoder choice was lossless or high bit rate, an estimation of a low bit rate version of the image is
obtained before applying any of the image processing algorithms explained later. One embodiment for performing such
an estimation is described below. To determine the order in which bits are allocated, information on the maximum of
absolute values of coefficients and the number of coding passes in a codeblock from headers, as well as heuristic and
statistical information on visual or MSE-based importance of subbands at various resolution levels, is used.

[0030] The estimation successively subtracts bits from the total number of bits per codeblock until a given bit rate
for the image is reached. The order of subtraction is the reverse of a bit allocation algorithm. The allocation algorithm
may be the same as the one used by the encoder, but it is not required to be.

[0031] From the packet header of a JPEG 2000 file, the length of a codeblock, i.e. the number of bits "B", the number of
zero bitplanes "NZ" and the number of coding passes "CP" used during encoding are available. From the number of
zero bitplanes, an estimation of the maximum value of absolute values of coefficients in the codeblock, $2^{MaxB}$, can be
obtained by computing the maximum non-zero bitplane

$$MaxB = MSB(\text{codeblock subband}) - NZ, \qquad (1)$$

where MSB is the maximum number of bitplanes of the specific subband to which the codeblock belongs. MSB is
defined by information in the appropriate QCC or QCD header entry for JPEG 2000. Based on visual or MSE-based
weighting or statistical properties of images, an order of subbands and bitplanes can be derived that reflects the
importance of a bit plane in a given subband. Based on, e.g., MSE importance, the ordering of importance of bit planes
in a subband of a 5-level decomposition is given by the one displayed in Table 2.



Table 2: Order of importance of bitplanes and subbands based on MSE weighting.

| order i (least important, i=1, to most important) | bitplane b(i) | subband s(i) | level l(i) |
|---|---|---|---|
| 1 | 1st bitplane | HH | level 1 |
| 2 | 1st bitplane | LH/HL | level 1 |
| 3 | 1st bitplane | HH | level 2 |
| 4 | 2nd bitplane | HH | level 1 |
| 5 | 1st bitplane | LH/HL | level 2 |
| 6 | 1st bitplane | HH | level 3 |
| 7 | 2nd bitplane | LH/HL | level 1 |
| 8 | 2nd bitplane | HH | level 2 |
| 9 | 1st bitplane | LH/HL | level 3 |
| 10 | 1st bitplane | HH | level 4 |
| 11 | 3rd bitplane | HH | level 1 |
| 12 | 2nd bitplane | LH/HL | level 2 |
| 13 | 2nd bitplane | HH | level 3 |
| 14 | 1st bitplane | LH/HL | level 4 |
| 15 | 1st bitplane | HH | level 5 |
| 16 | 3rd bitplane | LH/HL | level 1 |
| 17 | 3rd bitplane | HH | level 2 |
| 18 | 2nd bitplane | LH/HL | level 3 |
| 19 | 2nd bitplane | HH | level 4 |
| 20 | 4th bitplane | HH | level 1 |
| 21 | 3rd bitplane | LH/HL | level 2 |
| 22 | 3rd bitplane | HH | level 3 |
| 23 | 2nd bitplane | LH/HL | level 4 |
| 24 | 2nd bitplane | HH | level 5 |
| 25 | 4th bitplane | LH/HL | level 1 |
| 26 | 4th bitplane | HH | level 2 |
| 27 | 3rd bitplane | LH/HL | level 3 |
| 28 | 3rd bitplane | HH | level 4 |
| 29 | 2nd bitplane | LH/HL | level 5 |

[0032] The estimation algorithm uses that order and computes, for each codeblock and for order number i, the number
of coding passes CP(b(i)) that contain the specific bitplane, b(i), in the subband, s(i), and the corresponding level,
l(i), namely

$$CP(b(i)) = CP - ((MaxB(s(i), l(i)) - b(i)) \cdot 3 + 1) \qquad (2)$$
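
A small worked example may make Equations (1) and (2) concrete. The sketch below simply evaluates the two formulas as written; the function names and the sample values (MSB = 10, NZ = 3, CP = 16) are illustrative assumptions, not values taken from any particular file.

```python
def max_bitplane(msb_subband: int, zero_bitplanes: int) -> int:
    """Equation (1): MaxB = MSB(codeblock subband) - NZ."""
    return msb_subband - zero_bitplanes


def passes_containing_bitplane(cp: int, max_b: int, b: int) -> int:
    """Equation (2): CP(b) = CP - ((MaxB - b) * 3 + 1)."""
    return cp - ((max_b - b) * 3 + 1)


max_b = max_bitplane(10, 3)                          # 10 - 3 = 7
cp_b3 = passes_containing_bitplane(16, max_b, 3)     # 16 - (4*3 + 1) = 3
cp_b2 = passes_containing_bitplane(16, max_b, 2)     # 16 - (5*3 + 1) = 0
```

With these sample values the result for bitplane 3 is positive, so bits would be subtracted at that step, whereas the result for bitplane 2 is zero and the codeblock would be left unchanged.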
[0033] If that number is positive, a specific number of bits is subtracted from the codeblock bits. In one embodiment,
the specific number of bits is computed as the average number of bits per coding pass in the specific subband, or the
specific resolution. In the next step, order number (i+1), the derived number of bits is subtracted in a similar way from
the codeblocks for bitplane b(i+1) of subband s(i+1) at level l(i+1). In pseudo code, an exemplary estimation algorithm
for the example target rate of 0.5 bits/pixel is expressed as follows.



    max_i = largest_order_number
    target_rate = 0.5
    new_B = B;
    new_CP = CP;
    new_rate = sum(new_B * 8) / ImageSize;
    i = 1;

    while ((i <= max_i) && (new_rate > target_rate)) {
        for each codeblock m in subband s(i)
            elim_CP[m](b(i)) = new_CP[m] - ((MaxB(s(i), l(i)) - b(i)) * 3 + 1);
            if (elim_CP[m](b(i)) > 0)
                av_bits = new_B[m](s(i)) / new_CP[m](s(i));
                new_B[m] -= av_bits * elim_CP[m](b(i));
                if (new_B[m] < 0) new_B[m] = 0;
                new_CP[m] -= elim_CP[m](b(i));
            end
        end
        new_rate = sum(new_B * 8) / ImageSize;
        i = i + 1;
    end



new_B and new_CP are arrays whose size is the number of codeblocks.

[0034] Once the target rate is reached, the new estimated bit values "new_B" are used in the entropy processing
algorithms.
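
The estimation loop can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the `CodeBlock` record and the function name `estimate_low_rate` are hypothetical, the average number of bits per coding pass is taken per codeblock, codeblock lengths are assumed to be in bytes, and the number of eliminated passes is clamped to the passes remaining so that the division stays well defined.

```python
from dataclasses import dataclass


@dataclass
class CodeBlock:
    subband: str    # "HH" or "LH/HL", matching the subband column of the importance order
    level: int      # decomposition level, 1 = finest
    length: float   # coded length B, assumed to be in bytes
    passes: int     # number of coding passes CP
    max_b: int      # MaxB from Equation (1)


def estimate_low_rate(blocks, order, image_size, target_rate):
    """Return (estimated per-block lengths, reached rate in bits/pixel).

    order: list of (bitplane, subband, level), least important first.
    """
    new_b = [blk.length for blk in blocks]
    new_cp = [blk.passes for blk in blocks]
    rate = 8.0 * sum(new_b) / image_size
    for bitplane, subband, level in order:
        if rate <= target_rate:
            break
        for m, blk in enumerate(blocks):
            if blk.subband != subband or blk.level != level or new_cp[m] <= 0:
                continue
            # Equation (2), clamped to the passes that are still left.
            elim = min(new_cp[m], new_cp[m] - ((blk.max_b - bitplane) * 3 + 1))
            if elim > 0:
                avg = new_b[m] / new_cp[m]
                new_b[m] = max(0.0, new_b[m] - avg * elim)
                new_cp[m] -= elim
        rate = 8.0 * sum(new_b) / image_size
    return new_b, rate


blocks = [CodeBlock("HH", 1, 400.0, 12, 4), CodeBlock("LH/HL", 1, 400.0, 12, 4)]
order = [(1, "HH", 1), (1, "LH/HL", 1), (2, "HH", 1)]
est_b, est_rate = estimate_low_rate(blocks, order, 64 * 64, 1.2)
```

In this toy run all three order steps are needed before the rate falls below the 1.2 bits/pixel target; with a looser target the loop stops earlier and leaves the more important bitplanes untouched.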

[0035] There are many alternatives for estimating a low bit rate image from a high bit rate image. In an alternative
embodiment, the estimation uses a model of the distribution of the wavelet coefficients of an image.

[0036] It is assumed that the distribution of the wavelet coefficients can be described by a Gaussian or Laplacian
distribution. The latter is often used for modeling in the literature, since the coefficient distributions of many natural
images have been found in tests to follow the exponential distribution approximately. The Laplacian distribution has density

$$f(x) = \lambda e^{-\lambda |x|} \quad \text{for } \lambda > 0 \qquad (3)$$



[0037] The theoretical definition of the entropy is

$$H = -\sum_i p_i \log(p_i) \qquad (4)$$

where $p_i$ is the probability of an event $A_i$, i.e. $p_i = P(A_i)$. For a lossy compressed image, the events are the situations
that coefficients fall into specific quantization bins. In the case of scalar quantization with quantizer Q, the event $A_i$ is
described as the event that a coefficient is in the interval $[i \cdot 2^Q, (i+1) \cdot 2^Q)$, i.e.

$$p_i = P(A_i) = P(\text{wavelet coefficient } d \in [i \cdot 2^Q, (i+1) \cdot 2^Q)) \qquad (5)$$

For the Laplacian distribution, this results in

$$p_i = e^{-\lambda i 2^Q} - e^{-\lambda (i+1) 2^Q} \qquad (6)$$

[0038] If the parameter $\lambda$ could be estimated from the header data of a coding unit, then the pdf of the coefficients
in that coding unit could be estimated and the entropy for any given quantizer Q be determined.

EP 1 329 847 A1

[0039] The packet headers of a JPEG 2000 file include information on the number of zero bitplanes in a codeblock.
From this information an estimation on the maximum absolute values of coefficients in that codeblock can be obtained
by the variable MaxB from Equation (1). Using this variable, the parameter $\lambda$ can be estimated as

$$\lambda^* = \log_2(\#\text{coefficients per codeblock}) / 2^{MaxB} \qquad (7)$$

[0040] By inserting this estimate into the formulas in Equations (6) and (4), an estimate for the entropy given a specific
quantization is obtained. The value H yields bits per pixel. Since the codeblock length is measured in bytes, the
estimated value H has to be multiplied by 8*(#coefficients per codeblock). A final algorithm may use the same order as
the previously described method to reduce the number of bits in different subbands at different resolution levels
successively. The reduction of bits is given by setting the quantizer to the bitplane parameter b(i) from Table 2.
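
The model-based estimate can be sketched numerically as follows. The code evaluates the parameter estimate of Equation (7), the bin probabilities of Equation (6) and the entropy sum of Equation (4). Two choices here are assumptions of the sketch rather than statements from the text: the logarithm in the entropy is taken to base 2 so that the result is in bits, and the sum over bins is truncated once the remaining tail probability drops below a small threshold. The function names are hypothetical.

```python
import math


def lambda_estimate(num_coeffs: int, max_b: int) -> float:
    """Equation (7): lambda* = log2(#coefficients per codeblock) / 2^MaxB."""
    return math.log2(num_coeffs) / (2 ** max_b)


def bin_probability(lam: float, i: int, q: int) -> float:
    """Equation (6): probability of quantization bin [i*2^Q, (i+1)*2^Q)."""
    step = 2 ** q
    return math.exp(-lam * i * step) - math.exp(-lam * (i + 1) * step)


def entropy_estimate(lam: float, q: int, tail: float = 1e-12) -> float:
    """Equation (4) with base-2 logarithm, summed until the tail mass is below `tail`."""
    step = 2 ** q
    h, i = 0.0, 0
    while math.exp(-lam * i * step) > tail:
        p = bin_probability(lam, i, q)
        if p > 0.0:
            h -= p * math.log2(p)
        i += 1
    return h


lam = lambda_estimate(1024, 8)        # log2(1024) / 2^8 = 10 / 256
h_fine = entropy_estimate(lam, 2)     # quantizer step 2^2
h_coarse = entropy_estimate(lam, 4)   # quantizer step 2^4
```

Raising the quantizer parameter Q widens the bins and lowers the estimated entropy, which is the mechanism the text relies on when it sets the quantizer to the bitplane parameter to reduce the bit count.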

Image Analysis Processing Algorithms 

[0041] By exploiting the multi-scale entropy distribution that is accessible from the header, techniques may be used
to perform image analysis or computer vision and similar operations such as, for example, but not limited to,
segmentation, automatic scaling, resolution selection, and automatic region selection and cropping on the underlying image.
Common prior art techniques are described in W. Pratt, Digital Image Processing, (2nd Edition), John Wiley & Sons,
Inc., New York, NY, 1995, and A. Jain, Fundamentals of Digital Image Processing, Prentice Hall, Englewood Cliffs, NJ,
1995. In one embodiment, instead of the exact sample-wise multi-scale entropy distribution, the entropy distribution
over local blocks of multi-scale coefficients (such as code blocks in JPEG 2000), a granular entropy distribution, is
available. In one embodiment, the granular entropy distribution is used to process the underlying image.

[0042] As described herein, the use of multi-scale information from an image available in JPEG 2000 headers is
demonstrated in the framework of several image analysis algorithms (or computer vision). In one embodiment, the
header parameters that are used are PPM, PPT, SIZ, COD, COC, QCC and QCD. From these parameters, the location
of codeblocks in the wavelet domain and the number of bits used by the encoder to encode the corresponding
coefficients can be extracted. These numbers can be used to derive a bit distribution of the multi-scale representation of the
image. The scale and spatial localization of codeblocks, and the multi-scale bit distribution inferred from headers, lead
to different image processing applications such as multi-scale segmentation, automatic scaling, automatic scaling and
cropping, and production of multi-scale collage.

30 

Segmentation 

[0043] A classification technique assigns a class label to each small area in an image. Such an area can be an
individual pixel or a group of pixels, e.g. pixels contained in a square block. Various image analysis techniques use
the class assignments in different ways; for example, segmentation techniques separate an image into regions
with homogeneous properties, e.g. same class labels.

[0044] Using the multi-scale entropy distribution, a scale is assigned as the class label to each image region, so that
even if the coefficients from the finer scales are ignored, the visually relevant information about the underlying region is
retained at the assigned scale. Such labeling identifies the frequency bandwidth of the underlying image features.
Segmentation is posed as an optimization problem, and a statistical approach is invoked to solve the problem.

[0045] The location of codeblocks in the wavelet domain is given by the two-dimensional (2D) spatial location (i,k)
and scale j. For example, if processing an image of size 512x512 and having codeblocks of size 32x32, there are 8x8
codeblocks of size 32x32 in each band of level 1, 4x4 codeblocks per band at level 2, and 2x2 codeblocks per band
at level 3. The numbers of bits B_j(i,k) per codeblock location (i,k) at level j for the three different bands LH, HL and HH
at level j are added to yield the number of bits necessary to code the total coefficients at wavelet domain location (i,k).
In practice, a linear or non-linear combination of the different entropies can also be used to help distinguish between
vertical and horizontal features.
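
The codeblock counts and the per-location band sum described in paragraph [0045] can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names and the sample bit counts are hypothetical, and the plain sum is only one of the combinations the text allows.

```python
def codeblocks_per_side(image_size, codeblock_size, level):
    """Codeblocks along one side of a band at the given decomposition level."""
    band_size = image_size >> level          # each level halves the band side
    return max(1, band_size // codeblock_size)


def combine_band_bits(lh, hl, hh):
    """Add the LH, HL and HH bit counts at every codeblock location (i, k)."""
    return [[lh[i][k] + hl[i][k] + hh[i][k] for k in range(len(lh[0]))]
            for i in range(len(lh))]


# Hypothetical per-band bit counts for a level with 2x2 codeblocks per band.
lh = [[120, 40], [10, 0]]
hl = [[100, 60], [20, 5]]
hh = [[80, 20], [0, 0]]
bits_level = combine_band_bits(lh, hl, hh)
```

For a 512x512 image with 32x32 codeblocks, `codeblocks_per_side` gives 8, 4 and 2 for levels 1, 2 and 3, matching the example in the text.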

[0046] A scale j ∈ {1, ..., J} is assigned to each block, so that a cost function Λ is maximized,

$$ S_{opt} = \arg\max_{S \in \{1,\ldots,J\}^{MN}} \Lambda(B, S) \tag{8} $$

where S_opt is the optimal segmentation map for the entire image, S is one of the J^{MN} possible labelings of blocks of size
M x N with each block assigned one of the scales in {1...J}, and Λ(B,S) yields the cost given any segmentation S and
any entropy distribution B.

[0047] In one embodiment, the prior art Maximum A Posteriori ("MAP") approach is adopted from statistics to solve
the segmentation problem, because such an approach can be tuned to suit the final application. The basic ingredients
used by MAP to set the cost function Λ are the likelihood P(B | S), which is the probability of the image's entropy
distribution B, given segmentation map S, and the prior P(S), which is the probability of the segmentation map S. The
MAP cost function Λ is given by

$$ \Lambda(B, S) = P(B, S) = P(B \mid S)\, P(S) \quad \text{(Bayes' rule)}. \tag{9} $$

The MAP segmentation solution corresponds to optimizing equation (8), using equation (9).
[0048] The coefficients contained in a codeblock at level 1 contain information about a block of approximately twice
the size in the pixel domain. If the pixel domain is divided into blocks of a specific size, there are four times as many
blocks in the pixel domain as codeblocks at level 1 of the wavelet decomposition, 16 times as many blocks in the
pixel domain as codeblocks at level 2 of the wavelet decomposition, etc. Therefore, the bits of a codeblock B_j(i,k) of size
n x n contribute information to a block in the pixel domain of size 2^j n x 2^j n at location (i 2^j n, k 2^j n). Conversely, a pixel
block of size n x n at location (x,y) receives a fraction of the bits, estimated as 1/4^j, from the codeblock B_j(i,k) with

$$ i = \left\lfloor \frac{x}{2^{j} n} \right\rfloor, \qquad k = \left\lfloor \frac{y}{2^{j} n} \right\rfloor. $$

[0049] In one embodiment, the number of level-j bits associated with the pixel domain is defined as

$$ \hat{B}_j(x, y) = \frac{B_j(i, k)}{4^{j}}. \tag{10} $$

The above calculation is equivalent to piecewise interpolation of the entropy values. Other interpolation algorithms,
such as, for example, polynomial interpolation or other nonlinear interpolation, can be used as well to calculate the
level-j bits.

[0050] The cumulative weighted resolution-j entropy of a pixel block of size 2n x 2n at location (x,y) is given by

$$ B_j^{pixel}(x, y) = \sum_{l=1}^{J} \gamma_{j,l}\, \hat{B}_l(x, y) \tag{11} $$

with $\hat{B}_l(x,y)$ obtained as in equation (10) for the locations i and k in B_l(i,k), and weights γ_{j,l}. An example for a
collection of weights is

$$ \gamma_{j,l} = 0 \ \text{for } l < j \quad \text{and} \quad \gamma_{j,l} = w_j \ \text{for } l \ge j \tag{12} $$

with w_0 = 1, w_1 = 3.5, w_2 = 5.5, w_3 = 03, w_4 = 20. The parameters w_j and the weights γ_{j,l} may be changed depending on
the application. The set of values $B_j^{pixel}$ is called the cumulative weighted entropy of the image at resolution j.
[0051] The likelihood for the entropy $B^{pixel}(x,y)$ of a pixel domain block at location (x,y) is set to be the value of
$B_j^{pixel}(x,y)$ relative to the total weighted bits for all levels associated with the pixel domain location (x,y), namely

$$ P\big(B^{pixel}(x,y) \mid S(x,y) = j\big) = \frac{B_j^{pixel}(x,y)}{\sum_{l=1}^{J} B_l^{pixel}(x,y)}. \tag{13} $$

Under the assumption of the pixel domain blocks being independent, the total likelihood is given by

$$ P\big(B^{pixel} \mid S\big) = \prod_{(x,y)} P\big(B^{pixel}(x,y) \mid S(x,y)\big). \tag{14} $$

$B^{pixel}$ provides a multiscale entropy distribution for the original image.
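
Equations (10) to (13) can be sketched in a few lines. This is a hedged illustration under stated assumptions: `bits[j]` is a hypothetical 2-D list of codeblock bits at level j, pixel-block positions (x, y) are given in units of the block size n (so the codeblock index is a right shift by j), and `weights[j]` stands in for the per-level weights w_j, whose values here are invented for the example.

```python
def level_bits_at(bits, j, x, y):
    """Equation (10): share of level-j codeblock bits for pixel block (x, y)."""
    i, k = x >> j, y >> j                    # codeblock covering block (x, y)
    return bits[j][i][k] / 4 ** j


def cumulative_weighted(bits, weights, x, y):
    """Equations (11)-(12): gamma_{j,l} is 0 for l < j and w_j for l >= j."""
    levels = sorted(bits)
    return {j: sum(weights[j] * level_bits_at(bits, l, x, y)
                   for l in levels if l >= j)
            for j in levels}


def likelihoods(bits, weights, x, y):
    """Equation (13): cumulative entropy at each level over the total."""
    cum = cumulative_weighted(bits, weights, x, y)
    total = sum(cum.values())
    if total == 0:                           # no bits anywhere: uniform fallback
        return {j: 1.0 / len(cum) for j in cum}
    return {j: value / total for j, value in cum.items()}


# Hypothetical two-level example: 2x2 codeblocks at level 1, one at level 2.
bits = {1: [[64, 0], [0, 0]], 2: [[160]]}
weights = {1: 1.0, 2: 1.0}
```

The uniform fallback for a block with no bits at any level is a choice of this sketch; the text does not say how that case is handled.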

[0052] Now the prior P(S) has to be determined. The following discussion reflects existing knowledge about typical
segmentation maps. There are many possible ways to choose the prior. For example, other ways to choose the prior
are described in R. Neelamani, J. K. Romberg, H. Choi, R. Riedi, and R. G. Baraniuk, "Multiscale image segmentation
using joint texture and shape analysis," in Proceedings of Wavelet Applications in Signal and Image Processing VIII,
part of SPIE's International Symposium on Optical Science and Technology, San Diego, CA, July 2000; H. Cheng and
C. A. Bouman, "Trainable context model for multiscale segmentation," in Proc. IEEE Int. Conf. on Image Proc.-ICIP
'98, Chicago, IL, Oct. 4-7, 1998; and H. Choi and R. Baraniuk, "Multiscale texture segmentation using wavelet-domain
hidden Markov models," in Proc. 32nd Asilomar Conf. on Signals, Systems and Computers, Pacific Grove, CA, Nov.
1-4, 1998.

[0053] Because the segmentation map is expected to have contiguous regions, a prior is set on each location (x,y)
based on its immediate neighborhood N(x,y), which consists of nine blocks (using reflection at the boundaries). The
individual prior is

$$ P\big(S(x,y) \mid N(x,y)\big) = \frac{\big(\#(N(x,y) = S(x,y))\big)^{\alpha}}{\sum_{j=1}^{J} \big(\#(N(x,y) = j)\big)^{\alpha}}, \tag{15} $$

where #(N(x,y) = S(x,y)) is the number of neighbors which are the same as S(x,y), and α is a parameter that can be
increased to favor contiguous regions; α = 0 implies that the segmentation map blocks are independent of each other.
In one embodiment, the overall prior is chosen as

$$ P(S) = \prod_{x,y} P\big(S(x,y) \mid N(x,y)\big) \tag{16} $$

$$ \phantom{P(S)} = \prod_{x,y} \big(\#(N(x,y) = S(x,y))\big)^{\alpha}. \tag{17} $$

[0054] In one embodiment, α equals 0.02 to 0.08. The desired segmentation map can now be obtained by optimizing
the cost function Λ(B, S). A number of prior art iterative techniques may be used to search for the local maxima. One
iterative technique involves first calculating the initial segmentation map that optimizes the cost function using α = 0
in equation (17). The segmentation map maximizing the resulting cost function is obtained because the vector
optimization decouples into a scalar optimization problem. The segmentation map is given by

$$ S^{0}(x,y) = \arg\max_{j \in \{1,\ldots,J\}} P\big(B^{pixel}(x,y) \mid S(x,y) = j\big), \quad \text{for all } (x,y). \tag{18} $$

For all (x,y), the segmentation map at (x,y) is updated using

$$ S^{m}(x,y) = \arg\max_{j \in \{1,\ldots,J\}} P\big(B^{pixel}(x,y) \mid S(x,y) = j\big)\, P\big(S(x,y) = j \mid N(x,y)\big), \tag{19} $$

where N(x,y) is obtained from S^{m-1}. Each iteration, m is incremented to m = m+1. The iterative loop is repeated until
S^m = S^{m-1}. The iterative algorithm always converges, because the cost function Λ(B, S^m) is a non-decreasing function
with iterations m, and the cost function is bounded. The S^m obtained after convergence is the segmentation estimate.
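
The iteration of equations (18) and (19) can be sketched as below. Several points are assumptions of this sketch rather than statements from the text: `like[x][y][j]` is a hypothetical table of block likelihoods with scale labels indexed from 0, the nine-block neighborhood includes the block itself, reflection mirrors about the edge block without repeating it, all blocks are updated together from the previous map, and `max_iter` is a guard added for the sketch.

```python
def reflect(i, n):
    """Mirror an out-of-range index back into 0..n-1 (edge not repeated)."""
    if n == 1:
        return 0
    if i < 0:
        return -i
    if i >= n:
        return 2 * n - 2 - i
    return i


def neighbour_counts(seg, x, y, num_scales):
    """Count each label in the nine-block neighbourhood N(x, y)."""
    rows, cols = len(seg), len(seg[0])
    counts = [0] * num_scales
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            counts[seg[reflect(x + dx, rows)][reflect(y + dy, cols)]] += 1
    return counts


def map_segment(like, alpha=0.05, max_iter=100):
    """Start from the likelihood-only map, then apply the prior-weighted update."""
    rows, cols, num_scales = len(like), len(like[0]), len(like[0][0])
    scales = range(num_scales)
    # Equation (18): with alpha = 0 every block is optimised on its own.
    seg = [[max(scales, key=lambda j: like[x][y][j]) for y in range(cols)]
           for x in range(rows)]
    for _ in range(max_iter):
        new = []
        for x in range(rows):
            row = []
            for y in range(cols):
                counts = neighbour_counts(seg, x, y, num_scales)
                # Equation (19): likelihood times (#neighbours with label j)^alpha.
                row.append(max(scales,
                               key=lambda j: like[x][y][j] * counts[j] ** alpha))
            new.append(row)
        if new == seg:                       # S^m equals S^(m-1): stop
            break
        seg = new
    return seg


# Hypothetical 3x3 example: only the centre block slightly prefers label 1.
like = [[[0.6, 0.4] for _ in range(3)] for _ in range(3)]
like[1][1] = [0.45, 0.55]
```

With `alpha=0.0` the centre block keeps label 1, while a large value such as `alpha=1.0` pulls it to the label of its neighbors, which is the smoothing effect the prior is meant to have.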
[0055] The actual segmentation output in terms of labeling of regions is then given by the maximization of the MAP
cost function

$$ \Lambda(B, S^{m}) = P(B \mid S^{m})\, P(S^{m}), \tag{20} $$

as stated in equation (8) above.
[0056] Figure 2 is a flow diagram of one embodiment of a process for segmenting an image. Referring to Figure 2,
in process block 201, a file that contains a header that contains multi-scale entropy distribution information on blocks
of an image is received. In one embodiment, the file represents an image in JPEG 2000 format. In process block 202,
for each block, a scale from a set of scales is assigned to the block that maximizes a cost function. The cost function
is a product of a total likelihood and a prior. The total likelihood is a product of likelihoods of the blocks. In one
embodiment, each likelihood of a block is proportional to a summation, for each scale in the set of scales, of a product of a
weight of the scale and a number of bits spent to code the block at the scale. In one embodiment, the number of bits
spent to code the block at the scale is a numerator divided by a denominator. The numerator is an entropy distribution
of a multi-scale coefficient of the block at the scale. The denominator is four raised to the power of the scale. In process
block 203, the image is segmented by grouping together blocks that have been assigned equivalent scales.
[0057] Figure 3 illustrates a segmentation map superimposed on an exemplary image of a woman. In one
embodiment, the segmentation process (set forth above) labels the face regions of the image 301 with finer scales, and labels
the background regions with coarser scales to reflect the underlying features in the image. The different shades show
that the regions with different types of features are identified differently. In one embodiment, the segmentation process
assigns a scale to the different regions on the basis of the underlying features. The color-bar 302 on the right shows
the scales assigned to the different regions. Regions such as the face which contain many edges are labeled with a
fine scale 303. In contrast, the background regions are assigned coarser scales 304.

[0058] Figure 4 illustrates a segmentation map superimposed on an exemplary image of Japanese text. Since the
segmentation map 402 is uniform, the superimposition does not change the appearance of the original image 401. In
one embodiment, the segmentation process attempts to assign a scale to the different regions on the basis of the
underlying features. The color-bar 403 on the right shows the scales assigned to the different regions. Since the image
401 has uniform features, the algorithm has uniformly assigned the scale 3 to all regions in the image 401. In one
embodiment, the image coders in these examples used JPEG 2000 Part I reversible wavelet filters, five levels of
decomposition, code-block size 32x32, and a bit rate of 0.2 bits per pixel on gray scale images.
[0059] The results can be extended to color images. A linear or non-linear combination of the multi-scale entropy
allocations among the different color components can be used for segmentation. Segmentation can be performed on
only one component such as luminance or green. A segmentation algorithm can be run on each component separately,
and the results then combined using voting or by a MAP method.

[0060] In one embodiment, the resolution of the final results is limited by the granularity (coarseness) of the
multi-scale entropy distribution; typically, the resolution of the final results with respect to the underlying image is limited to
multiples of the code-block size. In one embodiment, when precincts are employed, better resolution can be obtained
if the precinct boundaries cause the code blocks to be split.

Automatic Resolution Selection

[0061] It is often desirable to know the best scale such that even if all finer scale coefficients are thrown away, the
retained coefficients contain sufficient information to identify the image. This may be used, for example, with digital
cameras. Since entropy is a good measure for visual information, it may be used as a measure for the amount of
visual information that is lost when an image is represented at scale j. Furthermore, the multi-scale representation
helps to identify the approximate areas in the image that lose their visual information during image scaling. The best
scale is estimated as follows. For each scale j, the importance of a given group of multi-scale coefficients S_ij to
reconstruct the respective part of the image is analyzed. The relative importance of the coefficients is inferred by comparing
their entropy to a scaled factor of the mean entropy from the immediate coarser level j+1 or a combination of all coarser
levels j+1, ..., J. S_ij is significant if its entropy B satisfies B > βμ_{j+1} and insignificant if B < βμ_{j+1}, where μ_{j+1} is the
mean of the number of bits per block at the coarser scale j+1, and β is a threshold parameter that dictates significance.
In one embodiment, β is 0.3.
[0062] For each scale j, measure the percentage P(j) of the image area that the significant coefficients at level j
cover. P(j) measures the area that would lose a significant amount of information if the significant coefficients at level
j are thrown away (when the image X is scaled down by a factor 2^j, then all coefficients at levels 1...j are lost in the
scaled down image). The coarsest possible scale J_opt is chosen so that at least P* percent of the area is still significant,
i.e.,

$$ P(J_{opt}) > P^{*}, \tag{21} $$

where P* is a threshold parameter that sets the minimum percentage of area that needs to be recognizable. In one
embodiment, P* equals 35%. The best scale that retains sufficient information about the image is J_opt. Hence, even if
the image is scaled down by a factor of 2^{J_opt - 1} on all sides, the image would still contain sufficient information in the
remaining coefficients to facilitate recognition of the image. It is possible to also set the significance threshold based on
all the coarser scale coefficients, or based on only some of the coarser scale coefficients.
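
One reading of the scale selection in paragraphs [0061] and [0062] is sketched below. The names and the sample bit counts are hypothetical, and the sketch makes several assumptions: `bits[j]` is a 2-D list of bits per block at level j, the fraction of significant blocks stands in for the percentage of image area, only the immediately coarser level sets the threshold, the coarsest level is skipped because it has no coarser level to compare against, and the finest level is returned when no level qualifies.

```python
def significant_area(bits_j, bits_coarser, threshold_factor=0.3):
    """Fraction of level-j blocks whose bits exceed factor * mean(coarser)."""
    flat = [b for row in bits_j for b in row]
    coarse = [b for row in bits_coarser for b in row]
    threshold = threshold_factor * sum(coarse) / len(coarse)
    return sum(b > threshold for b in flat) / len(flat)


def select_scale(bits, threshold_factor=0.3, min_area=0.35):
    """Coarsest level whose significant blocks cover more than min_area."""
    levels = sorted(bits)
    best = levels[0]                         # fallback: finest level
    for j in levels[:-1]:                    # last level has no coarser level
        if significant_area(bits[j], bits[j + 1], threshold_factor) > min_area:
            best = j
    return best


# Hypothetical bit counts for a three-level decomposition.
bits = {
    1: [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 50, 50], [1, 1, 50, 50]],
    2: [[40, 40], [40, 2]],
    3: [[100]],
}
```

In this example a quarter of the level-1 blocks are significant, which is below the 35% area threshold, while three quarters of the level-2 blocks are, so level 2 is selected.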

[0063] Figure 5 is a flow diagram of one embodiment of a process for adaptively scaling an image. In process block
501, a file is received that contains a header that contains multi-scale entropy distribution information on blocks of an
image. In one embodiment, the file represents an image in JPEG 2000 format. In process block 502, for each block,
it is determined that the block retains significance at a scale upon a determination that an entropy of a multi-scale
coefficient of a block at the scale is greater than a mean entropy of multi-scale coefficients of blocks in at least one
coarser scale. In one embodiment, the mean entropy is a mean bit distribution multiplied by a threshold parameter. In
process block 503, the image is scaled to a coarsest scale at which a threshold percentage of the blocks retain
significance at the scale.

[0064] Figure 6 illustrates adaptive scaling of an exemplary image of a woman. The size of the original image 601
is 512 by 512 pixels. The size of the scaled image 602 is 64 by 64 pixels. The black boxes 603, 604, and 605 display
the different possible choices for the scaled image size. In one embodiment, the scaling is determined using a
recognizable area of 35% and a significance threshold factor of 0.3. Figure 7 illustrates adaptive scaling of an exemplary
image of Japanese text. The size of the original image 701 is 512 by 512 pixels. The size of the scaled image 702 is
128 by 128 pixels. The black boxes 703 and 704 display the different possible choices for the scaled image size. In
one embodiment, the scaling is determined using a recognizable area of 35% and a significance threshold factor of
0.3. The scale selection algorithm set forth above may choose different scales for different images. The image 601 of
the woman is, according to one embodiment, down-sampled by a factor of 2^3, while the image 701 of Japanese
text is, according to one embodiment, down-sampled by a factor of 2^2. The difference in the scales arises because
the Japanese text image 701 has important components (reflected as higher entropy) in the higher frequency bands
relative to the image 601 of the woman.

[0065] Given the significance threshold, the labeling of a codeblock as significant or insignificant can also be
performed by modeling the entropy of all the codeblocks in one resolution level as a mixture of two probability distributions,
e.g., two Gaussian distributions with different means μ_1 and μ_2 and different standard deviations σ_1 and σ_2. From the
entropy values smaller than the significance threshold, the corresponding parameters are estimated. Given those two pdfs
f_1 and f_2, the probability of an entropy value x belonging to f_1 is estimated. This method is a standard procedure as
explained, e.g., in Duda, Hart, Stork, Pattern Classification (2nd ed.), Wiley, New York, NY 2000. The probability
distribution of the codeblocks at each resolution is then fed into a multiscale segmentation algorithm as described above.
[0066] Given the significance threshold, the optimal scale J_opt can also be selected as:

J_opt = argminci XB,(i,k) S (BZ IBj(i.k)). (22)



Fixed-Size-Window Automatic Cropping and Scaling 

[0067] Often, an image is constrained to be represented within a fixed size in pixels. Under such constraints, it is
desirable to choose the "best" representation of the image that satisfies the given size constraints. Since entropy is a
good measure for visual information, an image representation is obtained that encompasses the maximum entropy,
while still satisfying the size constraints.

[0068] The weighted cumulative entropy $B_j^{pixel}$ from equation (11) is used as an input to a maximization algorithm for
determination of the best scale (or resolution level) of an image convolved with a local indicator function. The weights
may be chosen as in the segmentation section, as γ_{j,l} = 0 for l < j and γ_{j,l} = 1 for l ≥ j.

[0069] A two-dimensional indicator function I is constructed with support dictated by the shape and size constraints
of the application. For example, if the desired shape constraint is a rectangle and the size constraints are the pixel
dimensions m x n, then the indicator function for a rectangle of size m x n located at position (x_0,y_0) is given by

$$ I_{m,n,(x_0,y_0)}(x,y) = \begin{cases} 1, & x_0 \le x < x_0 + m \ \text{and}\ y_0 \le y < y_0 + n \\ 0, & \text{otherwise} \end{cases} \tag{23} $$

The "best" location (a*,b*) of the rectangle placed at the "best" level j* is computed as

$$ \{(a^{*}, b^{*}), j^{*}\} = \arg\max_{(a,b),\, j} \sum_{p,q} B_j^{pixel}(p,q)\, I_{m,n,(a,b)}(p,q) \times \kappa_j(a,b), \tag{24} $$

where κ_j(a,b) is a matrix that controls the relative spatial and scale importance of the entropy. The size of κ_j(a,b) is the
same at all scales. (In order to deal with images that are centered, as well as to incorporate the natural human tendency,
a heuristic that is not incorporated into most image coders, the κ_j(a,b) are typically chosen such that the central portions
of the image are weighed more than the entropies at the edges of the image.) For 512 x 512 images with 32x32
codeblock size, an example for a set of spatial importance weighting matrices for j = 1,2,3,4 is

κ_1 = κ_2 = mask1 * 64/||mask1||,
mask1 = [(1.0 1.1 1.2 1.3 1.3 1.2 1.1 1.0) x (1.0 1.1 1.2 1.3 1.3 1.2 1.1 1.0)^T],
κ_3 = κ_4 = mask2 * 64/||mask2||,
mask2 = [1 1 1 1 1 1 1 1] x [1 1 1 1 1 1 1 1],

and ||mask1|| denotes the norm of the masking matrix.

[0070] Multiplying the cumulated weighted entropy at resolution j with mask1 means weighting the entropy values
linearly decreasing from 1 to 0.77 from the center towards the edges of the image at resolution j.

[0071] The best representation of the image is then obtained by theoretically computing the image at resolution j*
and cropping out of that low-resolution image a rectangle of size m x n located with the lower left corner at position
(a*/2^{j*}, b*/2^{j*}). This procedure is practically done by decoding only the codeblocks of the JPEG 2000 codestream that
contribute to that cropped part of the j*-reduced resolution image and performing an inverse transform on those data
to create the actual cropped image.
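
The joint search over location and level described for equation (24) can be sketched as an exhaustive scan. This is an illustration under assumptions, not the described implementation: `cum[j]` is a hypothetical 2-D grid of cumulative weighted entropy at level j in pixel-block units, the footprint of the display window at level j is assumed to be the window size times 2^j (clipped to the image), and `kappa[j][a][b]` is a hypothetical lookup for the importance weights.

```python
def best_window(cum, win_h, win_w, kappa=None):
    """Scan every level j and offset (a, b); keep the highest weighted sum."""
    best, best_score = None, float("-inf")
    for j, grid in sorted(cum.items()):
        rows, cols = len(grid), len(grid[0])
        # Window footprint in full-resolution block units, clipped to the image.
        h, w = min(win_h << j, rows), min(win_w << j, cols)
        for a in range(rows - h + 1):
            for b in range(cols - w + 1):
                # The indicator function selects the blocks inside the window.
                score = sum(grid[p][q]
                            for p in range(a, a + h)
                            for q in range(b, b + w))
                if kappa is not None:
                    score *= kappa[j][a][b]
                if score > best_score:
                    best, best_score = (a, b, j), score
    return best, best_score


# Hypothetical 4x4 grids: level 0 has a busy centre, level 1 is flat.
detail = [[1, 1, 1, 1], [1, 9, 9, 1], [1, 9, 9, 1], [1, 1, 1, 1]]
flat_low = [[1] * 4 for _ in range(4)]
flat_high = [[3] * 4 for _ in range(4)]
```

With `flat_low` at level 1 the search crops the busy centre at level 0; with `flat_high` it prefers the whole image at level 1, which mirrors the two outcomes described for the woman and Japanese text examples.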

[0072] Figure 8 is a flow diagram of one embodiment of a process for automatically scaling and cropping an image.
The process is performed by processing logic that may comprise hardware (e.g., circuitry, dedicated logic, etc.),
software (such as is run on a general purpose computer system or a dedicated machine), or a combination of both.
[0073] Referring to Figure 8, at processing block 801, a file that contains a header that contains multi-scale entropy
distribution information on blocks of an image is received, along with a shape constraint, such as display width and
display height. In one embodiment, the file represents an image in JPEG 2000 format.

[0074] At processing block 802, for each block and for each scale of a set of scales, a cumulative entropy distribution
for the block at that scale is set equal to a weighted summation of a number of bits spent to code the block for scales
at and between the first scale and a maximum scale.

[0075] At processing block 803, for each block and each scale chosen from a set of scales, and each width and
height offset chosen within a given image width and height, an indicator function of the block at the chosen scale and
chosen width and height offset is set to one upon determining that a width location of the block is not greater than a
first minimum value of a set consisting of the chosen width offset and a sum of the chosen width offset with the display
width scaled by the chosen scale, and the height location of the block is not greater than a first minimum value of a
set consisting of the chosen height offset and a sum of the chosen height offset with the display height scaled by the
chosen scale. Otherwise, the indicator function is set to zero. The first minimum value is a minimum value
of a set consisting of a width of the image and a sum of the width of the block plus one plus a desired height scaled
by the first scale. The second minimum value is a minimum value of a set consisting of a height of the image and a
sum of the height of the block plus one plus a desired width scaled by the first scale.

[0076] At processing block 804, an optimal location (width and height offset) and an optimal scale are computed that
together maximize a summation consisting of the cumulative entropy distribution for the block, multiplied by the indicator
function of the block (characterized by the scale, width and height offset) and with a parameter. At processing block
805, the image is cropped to the optimal location and the resulting cropped image is down-sampled to the optimal scale.
[0077] In one embodiment, the above process simultaneously chooses the region and its scaling factor for the
images. Figure 9 illustrates automatic scaling and cropping of an exemplary image of a woman according to one
embodiment. The size of the original image 901 is 512 by 512 pixels. The maximum size of the representation 902 is
constrained to be 192 by 192 pixels. To accommodate the final representation 902 within 192 by 192 pixels, the process
selects the important face region of the woman, and then scales it down by a factor of two. The fixed size representation
902 does not contain the unimportant background regions. The black box 903 displays the region with respect to the
original image that is being considered in the representation.

[0078] Figure 10 illustrates automatic scaling and cropping of an exemplary image of Japanese text. The size of the
original image 1001 is 512 by 512 pixels. The maximum size of the representation 1002 is constrained to be 192 by
192 pixels. The algorithm's best 192 by 192 pixel representation 1002 for the Japanese text image 1001 is simply the
whole image 1001 scaled down appropriately. The black box 1003 displays the region with respect to the original image
that is being considered in the representation. The whole Japanese text image 1001 is scaled down to obtain the
representation 1002.

Display Constraints

[0079] Display space is often a constraint on any device. Under such circumstances, it is desirable to obtain a
device-dependent, meaningful, condensed representation of images. By combining header-based processing with display
adaptation techniques, a variety of meaningful and condensed image representations can be provided. The display
device characteristics set an upper and lower bound on the size of the image to be represented. Since the automatic
scaling process set forth above suggests a scale which ensures that most of the image information is still retained in
the scaled down image, a scale can be chosen between the bounds dictated by the display device that is closest to
the suggested scale.
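
Choosing the scale that lies within the display bounds and is closest to the suggested scale can be sketched as follows. The function name and parameters are hypothetical, and the sketch assumes a square image whose side is halved at each scale.

```python
def display_scale(suggested, image_size, min_size, max_size):
    """Scale within the display bounds that is closest to the suggested one."""
    allowed = [j for j in range(image_size.bit_length())
               if min_size <= (image_size >> j) <= max_size]
    if not allowed:
        return None                          # no power-of-two scale fits
    return min(allowed, key=lambda j: abs(j - suggested))
```

For a 512-pixel image on a display that accepts sides between 100 and 300 pixels, the allowed scales are 1 (256 pixels) and 2 (128 pixels), so a suggested scale of 3 is adjusted to 2.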

[0080] Often, the size (e.g., in pixels) available to represent an image is fixed. In such a case, it is desirable to find
the best representation of the image that can be accommodated within the available pixels. The automatic region
selection and scaling technique set forth above can provide the best fixed-size representation of the image, by exploiting
the multi-scale entropy distribution. The parameters in the process can be chosen to tune the representation to specific
display devices.

Applications

[0081] One approach to compressing digital video sequences is to compress each video frame independently using
a multi-resolution image coder. For example, the Motion JPEG 2000 standard uses multi-scale transform-based
compression on each video frame independently. Since the proposed algorithms can effectively process these frames, the
aforementioned processing can be applied to Motion JPEG 2000 as well. For example, by setting the segmentation
process parameters such as α and γ_{j,l} appropriately, "active" regions, such as people in front of the background in a single
video frame, can be identified. This can be utilized to allocate more bits to the active regions in the next frame, so that
the people can be better identified if required. Significant changes in the entropy allocation with time across frames
can also be exploited to detect motion in the video. This may have special applications in surveillance cameras.

[0082] An aim of image classification is to automatically sort through an image database, and group images of similar
types such as natural images, portraits, documents, uniform textures, etc. Segmentation maps obtained by processing
the multi-scale entropy distributions can be exploited as a feature to perform broad classifications. The classification
can be fine-tuned later using more intensive and specialized processing.
[0083] An aim of image retrieval is to identify images that are similar to some template image. Since good image
retrieval algorithms are intensive and require the actual image to perform their analysis, header-based segmentation
maps can be exploited to reduce the number of images that need to be decoded and fed to the specialized
image-retrieval algorithms.

[0084] The segmentation process set forth above can be used to provide an approximate segmentation that splits
the image into regions containing coarse scale features and regions containing fine scale features. For example, in
document images, the segmentation algorithm can approximately distinguish the text regions from the images. The
approximate segmentation can be input to a more intensive pattern analysis algorithm such as optical character
recognition ("OCR") for further analysis.

[0085] The segmentation technique set forth above can be used to create an abstract collage representation of the
image, where different regions of the image are scaled more (or less) depending on whether the features contained
in the region are coarse or fine. Such an abstract representation of an image can possibly be used in many graphical
user interface ("GUI") image communication applications such as web browsers.

Multiscale Collage

[0086] For the calculation of a multiscale collage of an image, as a first step a segmentation as described in the
Segmentation section above is performed. After this, rectangles are fitted to the segmented image in the following way.
[0087] A multi-scale probability distribution such as the MAP cost function Λ(B^{pixel}, S^m) from Equation (20) or the
result of a monotonic transformation such as log Λ(B^{pixel}, S^m) is used as an input to a technique for fitting rectangles.
The goal is to find at each level j the rectangle whose probabilities are most similar to the probability at a larger level
j*. That means the content inside the rectangle most likely has meaningful content at all scales m, j < m ≤ j*. Its
corresponding image part is therefore likely to be well represented at resolution j. Once the rectangle is found, the locations
of entries covered by the rectangle are marked as "already counted" while the corresponding probability values are
penalized by adding a large value (e.g., 10). Once this procedure has been performed for all levels, the rectangle and
level is chosen that yields the minimal difference in probabilities to the rectangle at level j*. The position and size of
the rectangle as well as the associated level is saved in a list. In the next iteration step the procedure is applied again
to the penalized probabilistic distribution function (pdf) until all codeblock locations of the image are labeled as "already
counted". The information in the final list represents a rectangular multiscale partition of the image.

30 [0088] Figure 11 A is a schematic diagram of an apparatus to segment an image, to adaptively scale an image, or to 
automatically scale and crop an image. Referring to Figure 11 A, the apparatus 1101 comprises a receiving unit 1102 
to receive a file that contains a header that contains multi-scale entropy distribution infomiation on blocks of an image. 
In one embodiment, the file represents an image in J PEG 2000 fomriat. The apparatus 1 1 01 further comprises a process- 
ing unit 1103 coupled with the receiving unit 1102. In one embodiment, the processing unit 1103 is to, for each block, 

35 assign to the block a scale from a set of scales that maximizes a cost function. The cost function is a product of a total 
likelihood and a prior. The total likelihood Is a product of likelihoods of the blocks. In one embodiment, each likelihood 
of a block is proportional to a summation, for each scale in the set of scales, of a product of a weight of the scale and 
a number of bits spent to code the block at the scale. In one embodiment, the number of bits spent to code the block 
at the scale is a numerator divided by a denominator. The numerator is an entropy distribution of a multi-scale coefficient 

40 of the block at the scale. The denominator Is four raised to the power of the scale. 

[0089] In one embodiment, processing unit 1103 groups together blocks that have been assigned equivalent scales to
segment the image. In one embodiment, processing unit 1103, for each block, determines that the block retains
significance at a scale upon determining that an entropy of a multi-scale coefficient of a block at the scale is greater than
a mean entropy of multi-scale coefficients of blocks in at least one coarser scale. In one embodiment, the mean entropy
is a mean bit distribution multiplied by a threshold parameter.
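
The grouping step can be sketched in a few lines of Python. This is only an illustration: the dictionary mapping block positions to assigned scales and the function name are hypothetical, and the sketch groups purely by scale value without checking spatial adjacency, which an actual segmentation may also require.

```python
from collections import defaultdict


def group_blocks_by_scale(assigned_scale):
    """Map each scale to the sorted list of block positions assigned to it.

    assigned_scale: dict from (row, col) block position to its assigned scale
    (hypothetical layout).
    """
    segments = defaultdict(list)
    for position, scale in assigned_scale.items():
        segments[scale].append(position)
    return {scale: sorted(blocks) for scale, blocks in segments.items()}
```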

[0090] In one embodiment, processing unit 1103 further scales the image to a coarsest scale at which a threshold
percentage (e.g., 35%, described above as the threshold parameter P*) of the blocks retain significance at the scale.
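
A minimal Python sketch of the significance test and scale selection follows. It rests on several assumptions that are not in the text: a larger scale index means a coarser scale, the mean is pooled over all coarser scales, a scale with no coarser scale never qualifies, and the finest scale is returned if no scale qualifies. All names are hypothetical.

```python
from statistics import mean


def significant_fraction(bits_by_scale, scale, threshold):
    """Fraction of blocks at `scale` whose bits exceed threshold * mean coarser bits."""
    coarser = [b for s, blocks in bits_by_scale.items() if s > scale for b in blocks]
    if not coarser:
        return 0.0  # assumption: no coarser scale means no significance test
    cutoff = threshold * mean(coarser)
    blocks = bits_by_scale[scale]
    return sum(b > cutoff for b in blocks) / len(blocks)


def coarsest_significant_scale(bits_by_scale, threshold, min_fraction):
    """Coarsest scale at which at least min_fraction of the blocks are significant."""
    for scale in sorted(bits_by_scale, reverse=True):  # coarsest first
        if significant_fraction(bits_by_scale, scale, threshold) >= min_fraction:
            return scale
    return min(bits_by_scale)  # assumption: fall back to the finest scale


bits = {1: [10, 12, 1, 1], 2: [40, 2, 2, 2], 3: [5, 5]}
chosen = coarsest_significant_scale(bits, threshold=1.0, min_fraction=0.35)
```

With these example values, only one of four blocks at scale 2 exceeds the cutoff, so the search continues to scale 1, where half of the blocks do.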
[0091] Processing unit 1103 may, for each block and for each first scale of a set of scales, set a cumulative entropy
distribution for the block at the first scale equal to a summation of a number of bits spent to code the block for scales
at and between the first scale and a maximum scale.
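
The cumulative entropy distribution of one block can be sketched as a suffix sum over its per-scale bit counts. The list layout (index 0 is the first scale in the set, the last index is the maximum scale) and the function name are assumptions; the sketch is unweighted, as in this paragraph.

```python
def cumulative_entropy(bits_per_scale):
    """Return c where c[j] is the sum of bits_per_scale[j] through the maximum scale."""
    cumulative = [0.0] * len(bits_per_scale)
    running = 0.0
    # Walk from the maximum scale down so each entry accumulates all later scales.
    for j in range(len(bits_per_scale) - 1, -1, -1):
        running += bits_per_scale[j]
        cumulative[j] = running
    return cumulative
```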

[0092] Processing unit 1103 may, for each block and for each first scale of a set of scales, set an indicator function
of the block and the first scale to one upon determining that a width of the block is not greater than a first minimum
value and a height of the block is not greater than a second minimum value, and to zero otherwise. The first and
second minimum values are the same as described in Figure 8.

[0093] In one embodiment, processing unit 1103 further computes an optimal location and an optimal scale that
together maximize a summation, for each block in the optimal location at the optimal scale, of the cumulative entropy
distribution for the block at the optimal scale, multiplied by the indicator function of the block and the optimal scale,
multiplied by a parameter (e.g., k described above).
[0094] Then, processing unit 1103 crops the image to the optimal location and down-samples a resulting cropped
image to the optimal scale.
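
The search for the optimal location and scale can be sketched as an exhaustive search in Python. This is an illustration under stated assumptions, not the method of the specification: the indicator function is modeled as membership in a rectangular window of blocks, the window size per scale is supplied by the caller, and the per-scale parameter is passed as a plain weight. All names and the data layout are hypothetical.

```python
def best_crop(cumulative, window, weight):
    """Return (score, scale, top, left) maximizing weight * windowed sum.

    cumulative: dict scale -> 2D list of cumulative entropy per block
    window:     dict scale -> (rows, cols) of the crop window in blocks (assumed)
    weight:     dict scale -> multiplicative parameter for that scale (assumed)
    """
    best = None
    for scale, grid in cumulative.items():
        win_h, win_w = window[scale]
        rows, cols = len(grid), len(grid[0])
        for top in range(max(rows - win_h, 0) + 1):
            for left in range(max(cols - win_w, 0) + 1):
                # Indicator is one for blocks inside the window, zero elsewhere.
                total = sum(
                    grid[r][c]
                    for r in range(top, min(top + win_h, rows))
                    for c in range(left, min(left + win_w, cols))
                )
                score = weight[scale] * total
                if best is None or score > best[0]:
                    best = (score, scale, top, left)
    return best


cumulative = {
    0: [[1, 1, 1, 1], [1, 9, 9, 1], [1, 9, 9, 1], [1, 1, 1, 1]],
    1: [[10, 10], [10, 10]],
}
result = best_crop(cumulative, window={0: (2, 2), 1: (2, 2)}, weight={0: 1.0, 1: 0.5})
```

In this example the central 2x2 window at scale 0 scores 36, which beats the weighted score of 20 at scale 1, so the sketch selects scale 0 with the window at row 1, column 1.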

[0095] Figure 11B is a block diagram of one embodiment of a codestream processor for use in an image processing
system. Referring to Figure 11B, codestream 1121 is received by header extractor 1122 that extracts header
information. Segmentation unit 1123 performs segmentation on the codestream using the extracted header information from
header extractor 1122. In one embodiment, segmentation unit 1123 determines which codeblocks of codestream 1121
to decode and signals decoder 1124. Decoder 1124 decodes codeblocks necessary for the segmented image portion
(e.g., a region at a specified resolution).

Quantitative Example

[0096] The value of header-based processing is demonstrated in the example of creating a good 128x128 thumbnail
representation of a 1024x1024 image. The image analysis process used is the one for automatic cropping
and scaling as described above. The complexity of processed data, compared to traditional image processing of a
JPEG 2000 image and a raster image, is listed in Table 3. The advantage over an image in JPEG 2000 form is that
only 1/1000 of the data must be used by the segmentation algorithm and less than 1/2 of the data must be decoded.
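
The "1/1000" figure can be checked with a short Python calculation. It compares the 32x32 entries listed in Table 3 for header-based processing against the 1024x1024 samples of the full image; the variable names are illustrative only.

```python
full_samples = 1024 * 1024   # samples in the full-resolution image
header_samples = 32 * 32     # values processed when only headers are used (Table 3)

ratio = header_samples / full_samples
reduction_factor = full_samples // header_samples

print(f"header-based processing uses 1/{reduction_factor} of the data ({ratio:.4%})")
```

The exact ratio is 1/1024, or roughly 0.1%, which the text rounds to 1/1000.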



Table 3 - Quantitative example of header-based processing

                                Multiscale forward    Amount of data processed     Amount of decoding
                                wavelet transform     by segmentation algorithm

Raster image                    1024 x 1024           1024 x 1024                  -
JPEG 2000                       -                     1024 x 1024                  1024 x 1024
JPEG 2000 (header processing)   -                     32 x 32 (~0.1% of above)     ~33% of above



An Exemplary Computer System 



[0097] Figure 12 is a block diagram of an exemplary computer system that may perform one or more of the operations
described herein. Referring to Figure 12, computer system 1200 may comprise an exemplary client 1250 or server
1200 computer system. Computer system 1200 comprises a communication mechanism or bus 1211 for communicating
information, and a processor 1212 coupled with bus 1211 for processing information. Processor 1212 includes a
microprocessor, but is not limited to a microprocessor, such as, for example, Pentium™, PowerPC™, etc.
[0098] System 1200 further comprises a random access memory (RAM), or other dynamic storage device 1204
(referred to as main memory) coupled to bus 1211 for storing information and instructions to be executed by processor
1212. Main memory 1204 also may be used for storing temporary variables or other intermediate information during
execution of instructions by processor 1212.

[0099] Computer system 1200 also comprises a read only memory (ROM) and/or other static storage device 1206
coupled to bus 1211 for storing static information and instructions for processor 1212, and a data storage device 1207,
such as a magnetic disk or optical disk and its corresponding disk drive. Data storage device 1207 is coupled to bus
1211 for storing information and instructions.

[0100] Computer system 1200 may further be coupled to a display device 1221, such as a cathode ray tube (CRT)
or liquid crystal display (LCD), coupled to bus 1211 for displaying information to a computer user. An alphanumeric
input device 1222, including alphanumeric and other keys, may also be coupled to bus 1211 for communicating
information and command selections to processor 1212. An additional user input device is cursor control 1223, such as a
mouse, trackball, trackpad, stylus, or cursor direction keys, coupled to bus 1211 for communicating direction information
and command selections to processor 1212, and for controlling cursor movement on display 1221.
[0101] Another device that may be coupled to bus 1211 is hard copy device 1224, which may be used for printing
instructions, data, or other information on a medium such as paper, film, or similar types of media. Furthermore, a
sound recording and playback device, such as a speaker and/or microphone, may optionally be coupled to bus 1211
for audio interfacing with computer system 1200. Another device that may be coupled to bus 1211 is a wired/wireless
communication capability 1225 to communicate with a phone or handheld palm device.

[0102] Note that any or all of the components of system 1200 and associated hardware may be used in the present
invention. However, it can be appreciated that other configurations of the computer system may include some or all of
the devices.

[0103] Whereas many alterations and modifications of the present invention will no doubt become apparent to a
person of ordinary skill in the art after having read the foregoing description, it is to be understood that any particular
embodiment shown and described by way of illustration is in no way intended to be considered limiting. Therefore,
references to details of various embodiments are not intended to limit the scope of the claims which in themselves
recite only those features regarded as essential to the invention.

Claims

1. A method comprising:

generating a granular entropy distribution using information obtained from a header of a compressed bitstream;
and

applying one or more image processing operations based on the granular entropy distribution.

2. The method defined in Claim 1 further comprising decoding only a portion of coded data in the compressed
bitstream as part of applying the one or more image processing operations.

3. The method defined in Claim 1 further comprising assigning a class label based on the header.

4. An article of manufacture having one or more recordable medium with executable instructions stored thereon
which, when executed by a system, cause the system to:

generate a granular entropy distribution using information obtained from a header of a compressed bitstream;
and

apply one or more image processing operations based on the granular entropy distribution.

5. The article of manufacture defined in Claim 4 further comprising instructions which, when executed, cause the
system to decode only a portion of coded data in the compressed bitstream as part of applying the one or more
image processing operations.

6. The article of manufacture defined in Claim 4 further comprising instructions which, when executed, cause the
system to assign a class label based on the header.

7. An apparatus comprising:

means for generating a granular entropy distribution using information obtained from a header of a compressed
bitstream; and

means for applying one or more image processing operations based on the granular entropy distribution.

8. The apparatus defined in Claim 7 further comprising decoding only a portion of coded data in the compressed
bitstream as part of applying the one or more image processing operations.

9. The apparatus defined in Claim 7 further comprising assigning a class label based on the header.

10. A method comprising:

performing image analysis on a codestream based on header information in the codestream; and

decoding only coded data in one or more image portions specified by outputs of the image analysis.

11. The method defined in Claim 10 wherein performing image analysis comprises performing segmentation.

12. The method defined in Claim 11 wherein segmentation uses a Maximum A Posteriori approach.

13. The method defined in Claim 10 wherein performing image analysis comprises performing classification.

14. The method defined in Claim 10 wherein performing image analysis includes extracting a granular entropy
distribution.

15. An apparatus comprising:

means for performing image analysis on a codestream based on header information in the codestream; and

means for decoding only coded data in one or more image portions specified by outputs of the image analysis.

16. The apparatus of Claim 15 wherein the means for performing image analysis comprises means for performing
segmentation.

17. The apparatus of Claim 16 wherein the means for performing segmentation uses a Maximum A Posteriori approach.

18. The apparatus of Claim 15 wherein the means for performing image analysis comprises means for performing
classification.

19. The apparatus of Claim 15 wherein the means for performing image analysis comprises means for extracting a
granular entropy distribution.

20. An article of manufacture having one or more recordable medium with executable instructions stored thereon
which, when executed by a system, cause the system to:

perform image analysis on a codestream based on header information in the codestream; and

decode only coded data in one or more image portions specified by outputs of the image analysis.

21. The article of manufacture of Claim 20 further comprising instructions which, when executed, cause the system
to perform segmentation.

22. The article of manufacture of Claim 20 further comprising instructions which, when executed, cause the system
to perform classification.

23. The article of manufacture of Claim 21 further comprising instructions which, when executed, cause the system
to perform segmentation using a Maximum A Posteriori approach.

24. An article of manufacture of Claim 20 further comprising instructions which, when executed, cause the system to
extract a granular entropy distribution.

25. A method comprising:

extracting header information from codestream having encoded image data;

performing segmentation on the codestream based on the header information independent of decoding
encoded image data;

decoding encoded image data necessary to represent a segmented image portion.

26. The method defined in Claim 25 further comprising extracting a granular entropy distribution.

27. The method defined in Claim 25 wherein performing segmentation occurs prior to decoding encoded image data.

28. The method defined in Claim 27 wherein the segmented image portion comprises a region of an image at a specific
resolution.

29. An apparatus comprising:

means for extracting header information from codestream having encoded image data;

means for performing segmentation on the codestream based on the header information independent of
decoding encoded image data;

means for decoding encoded image data necessary to represent a segmented image portion.

30. The apparatus of Claim 29 further comprising means for extracting a granular entropy distribution.

31. The apparatus of Claim 29 wherein the means for performing segmentation performs segmentation prior to
decoding encoded image data.

32. The apparatus of Claim 31 wherein the segmented image portion comprises a region of an image at a specific
resolution.

33. An article of manufacture having one or more recordable medium with executable instructions stored thereon
which, when executed by a system, cause the system to:

extract header information from codestream having encoded image data;

perform segmentation on the codestream based on the header information independent of decoding encoded
image data;

decode encoded image data necessary to represent a segmented image portion.

34. The article of manufacture of Claim 33 further comprising instructions which, when executed, cause the system
to extract a granular entropy distribution.

35. The article of manufacture of Claim 33 further comprising instructions which, when executed, cause the system
to perform segmentation prior to decoding encoded image data.

36. The article of manufacture of Claim 35 wherein the image portion comprises a region of an image at a specific
resolution.

37. A method comprising:

receiving header information corresponding to a bit stream of multi-scale transform-based compressed data
representing image data;

generating a feature vector corresponding to image description bits in the bit stream from the header
information; and

performing one or more operations on at least a portion of the bit stream based on the feature vector.

38. The method defined in Claim 37 further comprising generating a distribution of the number of zero bit planes in
one or more portions of compressed data, the distribution derived from the heading information.

39. The method defined in Claim 37 further comprising generating an entropy distribution based on the header
information.

40. The method defined in Claim 39 wherein the entropy distribution is granular.

41. The method defined in Claim 39 wherein the entropy distribution comprises a map of bit distribution for the image
data.

42. The method defined in Claim 39 wherein the entropy distribution is a length of coded data for codeblocks.

43. The method defined in Claim 37 wherein the header information is part of a JPEG 2000 file.

44. The method defined in Claim 37 wherein one of the one or more operations comprises classification.

45. An apparatus comprising:

means for receiving header information corresponding to a bit stream of multi-scale transform-based
compressed data representing image data;

means for generating a feature vector corresponding to image description bits in the bit stream from the header
information; and

means for performing one or more operations on at least a portion of the bit stream based on the feature vector.

46. The apparatus of Claim 45 further comprising means for generating a distribution of the number of zero bit planes
in one or more portions of compressed data, wherein the distribution is derived from the header information.

47. The apparatus of Claim 46 further comprising means for generating an entropy distribution based on the header
information.

48. The apparatus of Claim 47 wherein the entropy distribution is granular.

49. The apparatus of Claim 47 wherein the entropy distribution comprises a map of bit distribution for the image data.

50. The apparatus of Claim 47 wherein the entropy distribution is a length of coded data for codeblocks.

51. The apparatus of Claim 45 wherein the header information is part of a JPEG 2000 file.

52. The apparatus of Claim 45 wherein one of the one or more operations comprises classification.

53. An article of manufacture having one or more recordable medium with executable instructions stored thereon
which, when executed by a system, cause the system to:

receive header information corresponding to a bit stream of multi-scale transform-based compressed data
representing image data;

generate a feature vector corresponding to image description bits in the bit stream from the header information;
and

perform one or more operations on at least a portion of the bit stream based on the feature vector.

54. A method for segmenting an image comprising:

receiving a header that contains multi-scale entropy distribution information on blocks of an image;

for each block, assigning to the block a scale from a set of scales that maximizes a cost function, wherein the
cost function is a product of a total likelihood and a prior, wherein the total likelihood is a product of likelihoods
calculated using the header of the block; and

segmenting the image by grouping together blocks that have been assigned equivalent scales.

55. The method of Claim 54, wherein the file represents an image in JPEG 2000 format.

56. The method of Claim 54, wherein each likelihood of a block is proportional to a summation, for each scale in the
set of scales, of a product of a weight of the scale and a number of bits spent to code the block at the scale.

57. The method of Claim 56, wherein the number of bits spent to code the block at the scale is a numerator divided
by a denominator, wherein the numerator is an entropy distribution of a multi-scale coefficient of the block at the
scale, and wherein the denominator is four raised to the power of the scale.

58. A method for adaptively scaling an image comprising:

receiving a header that contains multi-scale entropy distribution information on blocks of an image;

for each block, determining that the block retains significance at a scale upon determining that an entropy of
a multi-scale coefficient of a block at the scale is greater than a mean entropy of multi-scale coefficients of
blocks in at least one coarser scale; and

scaling the image to a coarsest scale at which a threshold percentage of the blocks retain significance at the
scale.

59. The method of Claim 58, wherein the file represents an image in JPEG 2000 format.

60. The method of Claim 58, wherein the mean entropy is a mean bit distribution multiplied by a threshold parameter.

61. The method of Claim 58 wherein the scale is selected based on the following equation:

=argnnn(i 2B,(i,lc)^{Bi IBj(i.k)). 

J fcij M U 
62. A method for automatically scaling and cropping an image, comprising:

receiving a file that contains a header that contains multi-scale entropy distribution information on blocks of
an image;

for each block and for each scale of a set of scales:

    setting a cumulative entropy distribution for the block at a scale equal to a weighted summation of a number
    of bits spent to code the block for scales at and between a first scale and a maximum scale; and

    for each width and height offset within a given image width and height, setting an indicator function of the
    block at the chosen scale and chosen width and height offsets to one upon determining that a width location
    of the block is not greater than a first minimum value and a height location of the block is not greater than
    a second minimum value, wherein the first minimum value is a minimum value of a set consisting of a
    chosen width offset and a sum of the chosen width offset with the display width scaled by the first scale,
    and wherein the second minimum value is a minimum value of a set consisting of a chosen height offset
    and a sum of the chosen height offset with the display height scaled by the first scale;

computing a location and scale that together maximize a summation consisting of the cumulative entropy
distribution for the block at the optimal scale multiplied with an indicator function of the block and by a
parameter; and

cropping the image to the optimal location and down-sampling a resulting cropped image to the optimal
scale.

63. The method defined in Claim 62 wherein the block is characterized by scale, width and height offsets.

64. The method of Claim 62, wherein the file represents an image in JPEG 2000 format.

65. A method comprising:

segmenting an image generating a rectangular multi-scale partition of the image based on a multi-scale
probability distribution; and

generating a rectangular multi-scale partition of the image based on the multi-scale probability distribution.

66. The method in Claim 65 wherein generating the rectangular multi-scale partition of the image comprises fitting
rectangles to the segmented image based on the multi-scale probability distribution, wherein fitting rectangles to
the segmented image includes finding a rectangle at each scale whose probabilities are similar to a probability at
a higher scale such that content of the image in the rectangle is represented at a resolution associated with that
scale.

67. The method defined in Claim 65 further comprising:

storing the rectangle; and

repeating the fitting operation for at least one other rectangle.

68. The method defined in Claim 65 further comprising choosing the rectangle and scale with minimal difference in
probabilities to the rectangle at a higher scale.

69. An apparatus comprising:

means for segmenting an image generating a rectangular multi-scale partition of the image based on a
multi-scale probability distribution; and

means for generating a rectangular multi-scale partition of the image based on the multi-scale probability
distribution.

70. The apparatus defined in Claim 69 wherein the means for generating the rectangular multi-scale partition of the
image comprises means for fitting rectangles to the segmented image based on the multi-scale probability
distribution, wherein the means for fitting rectangles to the segmented image includes means for finding a rectangle to
each scale whose probabilities are similar to a probability at a higher scale such that content of the image in the
rectangle is represented at a resolution associated with that scale.

71. The apparatus defined in Claim 69 further comprising:

means for storing the rectangle; and

means for repeating the fitting operation for at least one other rectangle.

72. The apparatus defined in Claim 65 further comprising means for choosing the rectangle and scale with minimal
difference in probabilities to the rectangle at a higher scale.

73. An article of manufacture having one or more recordable medium with executable instructions stored thereon
which, when executed by a system, cause the system to:

generate a rectangular multi-scale partition of an image based on a multi-scale probability distribution; and

generate a rectangular multi-scale partition of the image based on the multi-scale probability distribution.

74. An article of manufacture having one or more recordable media with executable instructions stored thereon which,
when executed by a machine, cause the machine to:

receive a header that contains multi-scale entropy distribution information on blocks of an image;

for each block, assign to the block a scale from a set of scales that maximizes a cost function, wherein the
cost function is a product of a total likelihood and a prior, wherein the total likelihood is a product of likelihoods
of the blocks; and

segment the image by grouping together blocks that have been assigned equivalent scales.

75. The article of manufacture of Claim 74, wherein the file represents an image in JPEG 2000 format.

76. The article of manufacture of Claim 74, wherein each likelihood of a block is proportional to a summation, for each
scale in the set of scales, of a product of a weight of the scale and a number of bits spent to code the block at the
scale.

77. The article of manufacture of Claim 76, wherein the number of bits spent to code the block at the scale is a numerator
divided by a denominator, wherein the numerator is an entropy distribution of a multi-scale coefficient of the block
at the scale, and wherein the denominator is four raised to the power of the scale.

78. An article of manufacture having one or more recordable media with executable instructions stored thereon which,
when executed by a machine, cause the machine to:

receive a file that contains a header that contains multi-scale entropy distribution information on blocks of an
image;

for each block, determine that the block retains significance at a scale upon determining that an entropy of a
multi-scale coefficient of a block at the scale is greater than a mean entropy of multi-scale coefficients of blocks
in at least one coarser scale; and

scale the image to a coarsest scale at which a threshold percentage of the blocks retain significance at the
scale.

79. The article of manufacture of Claim 78, wherein the file represents an image in JPEG 2000 format.

80. The article of manufacture of Claim 78, wherein the mean entropy is a mean bit distribution multiplied by a threshold
parameter.

81. An article of manufacture having one or more machine-readable media storing executable instruction thereon
which, when executed by a machine, cause the machine to:

receive a header that contains multi-scale entropy distribution information on blocks of an image;

for each block and for each first scale of a set of scales:

    set a cumulative entropy distribution for the block at the first scale equal to a summation of a number of
    bits spent to code the block for scales at and between the first scale and a maximum scale; and

    set an indicator function of the block and the first scale to one upon determining that a width of the block
    is not greater than a first minimum value and a height of the block is not greater than a second minimum
    value and to zero otherwise, wherein the first minimum value is a minimum value of a set consisting of a
    width of the image and a sum of the width of the block plus one plus a desired height scaled by the first
    scale, and wherein the second minimum value is a minimum value of a set consisting of a height of the
    image and a sum of the height of the block plus one plus a desired width scaled by the first scale;

compute an optimal location and an optimal scale that together maximize a summation, for each block in
the optimal location at the optimal scale, of the cumulative entropy distribution for the block at the optimal
scale, multiplied by the indicator function of the block and the optimal scale, multiplied by a parameter; and

crop the image to the optimal location and down-sample a resulting cropped image to the optimal scale.

82. The article of manufacture of Claim 81, wherein the file represents an image in JPEG 2000 format.

83. An apparatus comprising:

a receiving unit to receive a header that contains multi-scale entropy distribution information on blocks of an
image; and

a processing unit coupled with the receiving unit, the processing unit to, for each block, assign to the block a
scale from a set of scales that maximizes a cost function, wherein the cost function is a product of a total
likelihood and a prior, wherein the total likelihood is a product of likelihoods of the blocks; and

group together blocks that have been assigned equivalent scales to segment the image.

84. The apparatus of Claim 83, wherein the file represents an image in JPEG 2000 format.

85. The apparatus of Claim 83, wherein each likelihood of a block is proportional to a summation, for each scale in
the set of scales, of a product of a weight of the scale and a number of bits spent to code the block at the scale.

86. The apparatus of Claim 85, wherein the number of bits spent to code the block at the scale is a numerator divided
by a denominator, wherein the numerator is an entropy distribution of a multi-scale coefficient of the block at the
scale, and wherein the denominator is four raised to the power of the scale.

87. An apparatus to adaptively scale an image, comprising:

a receiving unit to receive a header that contains multi-scale entropy distribution information on blocks of an
image; and

a processing unit coupled with the receiving unit, the processing unit to, for each block, determine that the
block retains significance at a scale upon determining that an entropy of a multi-scale coefficient of a block at
the scale is greater than a mean entropy of multi-scale coefficients of blocks in at least one coarser scale; and

scale the image to a coarsest scale at which a threshold percentage of the blocks retain significance at the
scale.

88. The apparatus of Claim 87, wherein the file represents an image in JPEG 2000 format.

89. The apparatus of Claim 87, wherein the mean entropy is a mean bit distribution multiplied by a threshold parameter.

90. An apparatus to automatically scale and crop an image, comprising:

a receiving unit to receive a header that contains multi-scale entropy distribution information on blocks of an
image; and

a processing unit coupled with the receiving unit, the processing unit to, for each block and for each first scale
of a set of scales:

    set a cumulative entropy distribution for the block at the first scale equal to a summation of a number of bits
    spent to code the block for scales at and between the first scale and a maximum scale; and

    set an indicator function of the block and the first scale to one upon determining that a width of the block is
    not greater than a first minimum value and a height of the block is not greater than a second minimum value
    and to zero otherwise, wherein the first minimum value is a minimum value of a set consisting of a width of
    the image and a sum of the width of the block plus one plus a desired height scaled by the first scale, and
    wherein the second minimum value is a minimum value of a set consisting of a height of the image and a sum
    of the height of the block plus one plus a desired width scaled by the first scale;

compute an optimal location and an optimal scale that together maximize a summation, for each block in the
optimal location at the optimal scale, of the cumulative entropy distribution for the block at the optimal scale,
multiplied by the indicator function of the block and the optimal scale, multiplied by a parameter; and

crop the image to the optimal location and down-sample a resulting cropped image to the optimal scale.

91. The apparatus of Claim 90, wherein the file represents an image in JPEG 2000 format.

92. A method comprising:

obtaining an estimation of a low bit rate entropy distribution from a high bit rate granular entropy distribution
using information obtained from a header of a compressed bitstream; and

applying one or more image processing operations.

93. The method defined in Claim 92 wherein obtaining the estimation comprises extracting information from a first
plurality of layers and ignoring packets in layers other than the first plurality of layers.

94. The method defined in Claim 92 further comprising determining an order in which bits are allocated.

95. The method defined in Claim 92 wherein the high bit rate distribution is a non-lossy distribution.

96. The method defined in Claim 92 wherein the high bit rate distribution is a lossless distribution.
RECEIVE A FILE THAT CONTAINS A
HEADER THAT CONTAINS MULTI-SCALE ENTROPY DISTRIBUTION
INFORMATION ON BLOCKS OF AN IMAGE

FOR EACH BLOCK, ASSIGN TO THE BLOCK A SCALE FROM A SET OF
SCALES THAT MAXIMIZES A COST FUNCTION, WHEREIN THE COST
FUNCTION IS A PRODUCT OF A TOTAL LIKELIHOOD AND A PRIOR, WHEREIN
THE TOTAL LIKELIHOOD IS A PRODUCT OF INDIVIDUAL
LIKELIHOODS OF THE BLOCKS

202

SEGMENT THE IMAGE BY GROUPING TOGETHER BLOCKS THAT HAVE
BEEN ASSIGNED EQUIVALENT SCALES

203


RG.2 
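
The flow of FIG. 2 can be sketched in a few lines, under simplifying assumptions. The per-block likelihoods are taken as given, and the prior is assumed to factorize over blocks, so that maximizing the product of the total likelihood and the prior reduces to an independent choice per block. The figure itself does not state that factorization. The names `assign_scales` and `segment` and the sample values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Block = Tuple[int, int]  # (row, column) index of a block


def assign_scales(likelihood: Dict[Block, List[float]],
                  prior: List[float]) -> Dict[Block, int]:
    """Pick, for each block, the scale index j maximizing likelihood[j] * prior[j].

    Assumes a prior that factorizes over blocks, so the product of all
    block likelihoods times the prior can be maximized block by block.
    """
    assignment = {}
    for block, values in likelihood.items():
        scores = [l * p for l, p in zip(values, prior)]
        assignment[block] = max(range(len(scores)), key=scores.__getitem__)
    return assignment


def segment(assignment: Dict[Block, int]) -> Dict[int, List[Block]]:
    """Group together blocks that were assigned the same scale."""
    regions = defaultdict(list)
    for block in sorted(assignment):
        regions[assignment[block]].append(block)
    return dict(regions)


# Hypothetical likelihoods for a 2 x 2 grid of blocks and three scales.
likelihood = {
    (0, 0): [0.7, 0.2, 0.1],
    (0, 1): [0.6, 0.3, 0.1],
    (1, 0): [0.1, 0.2, 0.7],
    (1, 1): [0.2, 0.5, 0.3],
}
prior = [1 / 3, 1 / 3, 1 / 3]  # uniform prior over scales
regions = segment(assign_scales(likelihood, prior))
```

With a prior that couples neighbouring blocks, the per-block `max` would have to be replaced by a joint optimization. The grouping step would stay the same.
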






EP 1 329 847 A1 




FIG. 3









RECEIVE A FILE THAT CONTAINS A HEADER THAT CONTAINS MULTI-SCALE ENTROPY
DISTRIBUTION INFORMATION ON BLOCKS OF AN IMAGE (501)

FOR EACH BLOCK, DETERMINE IF THE BLOCK RETAINS SIGNIFICANCE AT A SCALE BY
DETERMINING WHETHER AN ENTROPY OF A MULTI-SCALE COEFFICIENT OF A BLOCK AT THE
SCALE IS GREATER THAN A MEAN ENTROPY OF MULTI-SCALE COEFFICIENTS OF BLOCKS IN
AT LEAST ONE COARSER SCALE

SCALE THE IMAGE TO A COARSEST SCALE AT WHICH A THRESHOLD PERCENTAGE OF THE
BLOCKS RETAIN SIGNIFICANCE AT THE SCALE
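
The significance test and scale selection above might look as follows in code. This is only a sketch: `entropy[j][b]` is an assumed table of per-block entropies, with scale index 0 the finest and larger indices coarser. "At least one coarser scale" is interpreted here as the single next coarser scale, and the finest scale is used as a fallback. All three choices are assumptions made for illustration.

```python
from typing import List


def significant_fraction(entropy: List[List[float]], scale: int) -> float:
    """Fraction of blocks whose entropy at `scale` exceeds the mean block
    entropy at the next coarser scale (scale + 1)."""
    coarser = entropy[scale + 1]
    mean_coarser = sum(coarser) / len(coarser)
    current = entropy[scale]
    return sum(e > mean_coarser for e in current) / len(current)


def coarsest_significant_scale(entropy: List[List[float]], threshold: float) -> int:
    """Return the coarsest scale index at which at least `threshold` of the
    blocks retain significance; fall back to the finest scale (0)."""
    for scale in range(len(entropy) - 2, -1, -1):  # search from coarse to fine
        if significant_fraction(entropy, scale) >= threshold:
            return scale
    return 0


# Hypothetical entropies: 3 scales (0 = finest), 4 blocks each.
entropy = [
    [8, 9, 7, 10],
    [5, 6, 1, 2],
    [1, 1, 1, 1],
]
chosen = coarsest_significant_scale(entropy, threshold=0.7)
```

In this example three of the four blocks at scale 1 exceed the mean entropy of scale 2, so a threshold of 0.7 selects scale 1, while a stricter threshold of 0.8 falls through to scale 0.
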

DISTRIBUTION INFORMATION ON BLOCKS OF AN IMAGE AND RECEIVE SHAPE CONSTRAINTS
SUCH AS DISPLAY WIDTH AND DISPLAY HEIGHT

FOR EACH BLOCK AND FOR EACH SCALE OF A SET OF SCALES, SET A CUMULATIVE
ENTROPY DISTRIBUTION FOR THE BLOCK AT THAT SCALE EQUAL TO A WEIGHTED
SUMMATION OF A NUMBER OF BITS SPENT TO CODE THE BLOCK FOR SCALES AT AND
BETWEEN THE FIRST SCALE AND A MAXIMUM SCALE

FOR EACH BLOCK, EACH SCALE CHOSEN FROM A SET OF SCALES, AND EACH WIDTH AND
HEIGHT OFFSET CHOSEN WITHIN THE GIVEN IMAGE WIDTH AND HEIGHT, SET AN
INDICATOR FUNCTION OF THE BLOCK AT THE CHOSEN SCALE AND CHOSEN WIDTH AND
HEIGHT OFFSET TO ONE UPON DETERMINING THAT THE WIDTH LOCATION OF THE BLOCK IS
NOT GREATER THAN A FIRST MINIMUM VALUE OF A SET CONSISTING OF THE CHOSEN
WIDTH OFFSET AND A SUM OF THE CHOSEN WIDTH OFFSET WITH THE DISPLAY WIDTH
SCALED BY THE CHOSEN SCALE, AND THE HEIGHT LOCATION OF THE BLOCK IS NOT
GREATER THAN A FIRST MINIMUM VALUE OF A SET CONSISTING OF THE CHOSEN HEIGHT
OFFSET AND A SUM OF THE CHOSEN HEIGHT OFFSET WITH THE DISPLAY HEIGHT SCALED
BY THE CHOSEN SCALE

COMPUTE AN OPTIMAL LOCATION (WIDTH AND HEIGHT OFFSET) AND AN OPTIMAL SCALE
THAT TOGETHER MAXIMIZE A SUMMATION CONSISTING OF THE CUMULATIVE ENTROPY
DISTRIBUTION FOR THE BLOCK MULTIPLIED WITH THE INDICATOR FUNCTION OF THE
BLOCK (CHARACTERIZED BY THE SCALE, WIDTH AND HEIGHT OFFSET) AND WITH A
PARAMETER

CROP THE IMAGE TO THE OPTIMAL LOCATION AND DOWN-SAMPLE THE RESULTING CROPPED
IMAGE TO THE OPTIMAL SCALE
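
The crop-and-scale search described by these steps can be sketched as an exhaustive search, under several stated assumptions. `bits[j][r][c]` is a hypothetical table of bits spent on the block at grid position (r, c) for scale j, `weights` are the weights of the cumulative summation, and `scale_param` stands in for the per-scale parameter. The window is assumed to double in size with each coarser scale, and the indicator function is simplified to a plain inside-the-window test rather than the minimum-based expression in the figure.

```python
from itertools import product
from typing import List, Optional, Sequence, Tuple

Grid = List[List[float]]


def cumulative_entropy(bits: Sequence[Sequence[Sequence[float]]],
                       weights: Sequence[float]) -> List[Grid]:
    """cum[j][r][c] = sum over scales k >= j of weights[k] * bits[k][r][c]."""
    num_scales = len(bits)
    rows, cols = len(bits[0]), len(bits[0][0])
    cum = [[[0.0] * cols for _ in range(rows)] for _ in range(num_scales)]
    for j in range(num_scales - 1, -1, -1):
        for r, c in product(range(rows), range(cols)):
            tail = cum[j + 1][r][c] if j + 1 < num_scales else 0.0
            cum[j][r][c] = weights[j] * bits[j][r][c] + tail
    return cum


def best_crop(cum: List[Grid], display_rows: int, display_cols: int,
              scale_param: Sequence[float]) -> Optional[Tuple[float, int, int, int]]:
    """Exhaustively search (scale, row offset, column offset).

    The window at scale j spans display_rows * 2**j by display_cols * 2**j
    blocks; the indicator is 1 for blocks inside the window and 0 outside.
    Returns (score, scale, row_offset, col_offset), or None if nothing fits.
    """
    rows, cols = len(cum[0]), len(cum[0][0])
    best = None
    for j in range(len(cum)):
        h, w = display_rows << j, display_cols << j
        if h > rows or w > cols:
            continue  # window would not fit inside the image
        for r0, c0 in product(range(rows - h + 1), range(cols - w + 1)):
            inside = sum(cum[j][r][c]
                         for r in range(r0, r0 + h)
                         for c in range(c0, c0 + w))
            score = scale_param[j] * inside
            if best is None or score > best[0]:
                best = (score, j, r0, c0)
    return best


# Hypothetical 4 x 4 block grid: detail concentrated in the lower-right corner.
hot = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 8, 8], [0, 0, 8, 8]]
flat = [[1] * 4 for _ in range(4)]
cum = cumulative_entropy([hot, flat], weights=[1.0, 1.0])
result = best_crop(cum, display_rows=2, display_cols=2, scale_param=[1.0, 1.0])
```

The returned offsets and scale would then drive the final step: crop the block window at the chosen offsets and down-sample it by the factor implied by the chosen scale. Raising `scale_param` for coarser scales shifts the choice toward showing more of the image at lower resolution.
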